Revision history for Email-Abuse-Investigator

0.08	Wed May 13 19:05:54 EDT 2026

  [Enhancements]

  - Added test dashboard at https://nigelhorne.github.io/Email-Abuse-Investigator/coverage/

  - Added fuzz testing

  - Fixed domain_expires_soon logic, which was broken because it ignored
    the seconds field

  - Added -i / --interactive flag to submit_abuse_report.pl, modelled on
    rm -i.  When set, the script prompts for confirmation before sending
    each abuse report.  The recipient address and the reason for contacting
    them (the role string) are displayed, followed by a "Send? [y/N]"
    prompt.  Only reports answered with "y" are sent; all others are
    skipped and noted in the output but not counted as failures.  Has no
    effect in --dry-run mode.  Reads confirmation from /dev/tty directly
    via a new _read_tty() helper sub so the prompt works correctly when
    the email is supplied on stdin, and so the prompt logic is testable
    without a pseudo-terminal.

  - Added t/submit_script.t: a new black-box test suite for
    submit_abuse_report.pl that runs the script as a subprocess via
    IPC::Open3.  Covers --dry-run output, --interactive with --dry-run
    (no prompt shown), the no-contacts unresolved domain listing, spoofed
    From: domain exclusion from the unresolved list, and --help
    documentation of --interactive.  The suite skips automatically if the
    script's dependencies are not installed, so it is safe to add to the
    distribution without breaking CI on minimal Perl installs.

  - Added abuse contacts for major URL shorteners to %PROVIDER_ABUSE.
    Previously URL shorteners were only flagged as a risk indicator;
    the module fell back to WHOIS which returned the registrar (e.g.
    Gandi for is.gd), who correctly responded that they have no control
    over how the shortener service is used.  The shortener operator is
    the right contact.  Entries added: is.gd, bit.ly/bitly.com,
    tinyurl.com, ow.ly (Hootsuite), buff.ly (Buffer), rb.gy, cutt.ly,
    shorturl.at.

  - Added major delivery company domains (fedex.com, ups.com, dhl.com,
    usps.com, royalmail.com) to %TRUSTED_DOMAINS.  These domains appear
    as URL hosts in delivery-impersonation spam (the spammer links to
    the real fedex.com to make the message look legitimate) but the
    delivery company is the victim of impersonation, not a party to
    report.  Their registrar (CSC Global) was generating false positive
    abuse contacts.

  - Added Dynadot to %PROVIDER_ABUSE as a form-only entry.  Dynadot
    explicitly rejects email abuse reports per their autoresponse,
    directing reporters to their web form instead:
      dynadot.com -> https://www.dynadot.com/report-abuse
    Discovered via a real autoresponse received during testing.

  - Added role string display cap to abuse_contacts().  When multiple
    distinct routes converge on the same abuse address and the joined
    role string would exceed 80 characters, it is summarised as
    "N routes: type1, type2, ..." (e.g. "4 routes: Sending ISP, URL
    host, Account provider, DKIM signer").  The full detail is always
    available via the roles arrayref for callers that need it; only the
    role (singular) display string is capped.  This keeps the dry-run
    footer and live-run output readable when Google or Microsoft is
    identified via four or more independent discovery routes.
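
    A minimal sketch of the capping rule described above.  The helper
    name summarise_roles() and the rule used to derive a role's "type"
    (the text before the first ':' or '(') are illustrative assumptions,
    not the module's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper: cap a merged role display string at $max characters.
sub summarise_roles {
    my ($roles, $max) = @_;
    $max //= 80;    # corresponds to $ROLE_MAX_LEN in the entry above

    my $joined = join(' and ', @{$roles});
    return $joined if length($joined) <= $max;

    # Assumed rule: a role's "type" is everything before ':' or '('
    my (%seen, @types);
    for my $role (@{$roles}) {
        (my $type = $role) =~ s/\s*[:(].*\z//s;
        push @types, $type unless $seen{$type}++;
    }
    return scalar(@{$roles}) . ' routes: ' . join(', ', @types);
}

my @roles = (
    'Sending ISP',
    'URL host: ruseriver.blogspot.com',
    'Account provider (from: spam@gmail.com)',
    'DKIM signer',
);
print summarise_roles(\@roles), "\n";
```

    Short role lists pass through unchanged, so callers that never hit
    the cap see the same joined string as before.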

  - Added unresolved_contacts() public method to Email::Abuse::Investigator.
    Returns a list of hashrefs describing domains and URL hosts found in
    the message for which no abuse contact could be determined -- i.e.
    they are not in %PROVIDER_ABUSE and produced no usable result from
    IP or domain WHOIS.  Domains whose only source is a spoofable
    sending header (From:, Return-Path:, Sender:) are excluded, as are
    domains already covered by abuse_contacts() or form_contacts().
    Each hashref contains domain, type (url_host or domain), and source.
    submit_abuse_report.pl now delegates its _print_unresolved() helper
    to this method rather than reimplementing the filtering logic inline.

  - Added new constructor options to allow per-object overrides of the
    three built-in lookup tables:
      provider_abuse
      trusted_domains
      url_shorteners
    Caller-supplied entries are merged on top of the built-in defaults.
    All options are also readable from an Object::Configure
    configuration file.  The three tables are now stored per-object
    so two objects with different overrides are fully independent.

  - Replaced alarm()-based read timeout in _raw_whois() with IO::Select
    so that WHOIS queries time out reliably on Windows and threaded Perls.
    The magic numbers 43 (WHOIS port) and 4096 (read chunk) are now
    Readonly constants $WHOIS_PORT and $WHOIS_READ_CHUNK.
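
    A minimal sketch of a select-based read timeout of the kind
    described above.  read_with_timeout() is a hypothetical name, the
    timeout here applies to each wait rather than to the whole query,
    and the demonstration uses a pipe on a Unix-like system instead of
    a real WHOIS socket.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: read from $fh until EOF, or until no data
# arrives within $timeout seconds.  Returns whatever was read.
sub read_with_timeout {
    my ($fh, $timeout, $chunk) = @_;
    $chunk //= 4096;    # corresponds to $WHOIS_READ_CHUNK

    my $select = IO::Select->new($fh);
    my $buf = '';
    while ($select->can_read($timeout)) {
        my $n = sysread($fh, my $data, $chunk);
        last unless $n;     # undef means error, 0 means EOF
        $buf .= $data;
    }
    return $buf;
}

# Demonstration on a pipe rather than a real WHOIS connection
pipe(my $reader, my $writer) or die "pipe: $!";
syswrite($writer, "OrgAbuseEmail: abuse\@example.net\n");
close($writer);
print read_with_timeout($reader, 2);
```

    Unlike alarm(), this approach does not depend on signal delivery,
    which is why it behaves the same under threads.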

  - Added --bcc [ADDRESS] option to submit_abuse_report.pl.  When given,
    a blind carbon copy of every outgoing report is sent to ADDRESS.  If
    ADDRESS is omitted, the copy goes to the --from address, which is the
    common case for keeping a personal record of what was sent.  The BCC
    is implemented as a second SMTP RCPT TO envelope recipient with no
    Bcc: header in the message, so the primary abuse contact never sees
    the monitoring address.  In --dry-run mode the BCC address is shown
    in the output header but no mail is sent.

  - When the analysis finds domains or URL hosts in the message but cannot
    determine an abuse contact for them, they are now listed so the user
    knows where to look for manual follow-up.  The list appears in three
    places: after
    the "No abuse contacts could be determined" message, at the end of
    --dry-run output, and at the end of a live run summary.  Domains whose
    only source is a spoofable sending header (From:, Return-Path:,
    Sender:) are excluded -- these are innocent victims of address forgery
    rather than parties to investigate.  Domains already covered by
    abuse_contacts() or form_contacts() are also excluded to avoid
    redundancy.  Discovered via a spam message from nced.edu.kw (spoofed) that
    contained mailto:gyomu@tolde.co.jp and http://www.toolde.co.jp in the
    body -- both genuine spam contact points that previously produced no
    output at all.

  - Added cross-message CHI cache to avoid redundant network lookups across
    multiple messages processed in the same run.  A shared in-memory CHI
    instance (TTL 1 hour) is initialised on first call to new() when CHI
    is installed.  IP WHOIS results are cached under "whois_ip:$ip",
    domain analysis under "dom:$domain", and DNS resolution under
    "resolve:$host".  Failed DNS lookups are cached as an empty string
    and returned as undef on subsequent calls so the resolver is not
    retried.  Gracefully degrades to the existing per-message cache when
    CHI is not installed.

  - Added IPv6 support throughout.  The @PRIVATE_RANGES table now covers
    fe80::/10 (link-local), 2001:db8::/32 (RFC 3849 documentation range),
    and 64:ff9b::/96 (NAT64 well-known prefix), in addition to the
    loopback and ULA ranges already present.  @RECEIVED_IP_RE now includes
    a bracketed IPv6 pattern so IPv6 addresses are extracted from
    Received: headers.  _extract_ip_from_received() accepts colon-
    containing addresses without IPv4 validation.  _resolve_host() tries
    an AAAA query after a failed A query when Net::DNS is available.
    _raw_whois() uses IO::Socket::IP (dual-stack) in preference to
    IO::Socket::INET when that module is installed.

  - Added multipart recursion guard.  _decode_multipart() now accepts a
    $depth parameter (starting at 0 from _split_message()).  When depth
    reaches MAX_MULTIPART_DEPTH (Readonly constant, value 20) the method
    carps and returns immediately rather than recursing further, preventing
    stack exhaustion on pathological crafted messages with deeply nested
    MIME structures.

  - Added Domain::PublicSuffix support to _registrable().  When
    Domain::PublicSuffix is installed, get_root_domain() is used for
    accurate eTLD+1 normalisation covering the full Public Suffix List.
    The existing heuristic (handling co.uk, com.au, and similar common
    two-label ccTLD second-levels) is retained as a fallback when the
    module is absent.

  - Added parallel DNS resolution via AnyEvent::DNS.  When AnyEvent::DNS
    is installed and a message contains more than one unique URL hostname,
    _extract_and_resolve_urls() fires all A queries concurrently via a
    condvar and pre-populates the host cache before the sequential
    enrichment loop runs.  Falls back transparently to sequential
    resolution when AnyEvent::DNS is not installed or the host list
    contains only one entry.

  - Added input sanitisation to parse_email().  The raw message text is
    stripped of characters outside [\x09\x0A\x0D\x20-\x7E\x80-\xFF]
    (i.e. C0 controls other than tab, LF, and CR, and the DEL character)
    before storage in _raw and header parsing.  High bytes (0x80-0xFF)
    are preserved to avoid corrupting valid UTF-8 content in headers and
    bodies.
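
    The character class above can be expressed as a single tr///
    deletion.  This is a sketch only; strip_control_bytes() is a
    hypothetical name and the module may implement the filter
    differently.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the character class quoted above:
# delete every byte that is NOT tab, LF, CR, printable ASCII, or
# a high byte in the range 0x80-0xFF.
sub strip_control_bytes {
    my ($text) = @_;
    $text =~ tr/\x09\x0A\x0D\x20-\x7E\x80-\xFF//cd;    # c = complement, d = delete
    return $text;
}

# NUL, BEL, and DEL are removed; CRLF, tab, and the UTF-8 bytes survive
my $raw   = "Subject: he\x00llo\x07\r\n\tcaf\xC3\xA9\x7F\n";
my $clean = strip_control_bytes($raw);
```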

  - Added _sanitise_output() private function.  Strips C0 control
    characters (0x01-0x08, 0x0B, 0x0C, 0x0E-0x1F) and DEL (0x7F) from
    any user-derived string before it is written to report() or
    abuse_report_text() output.  Tabs, LF, and CR are preserved.  High
    bytes (0x80-0xFF) are preserved for UTF-8 content.  Applied to all
    user-derived fields: IP info, organisation names, registrar names,
    flag detail strings, and header values.

  - Added Object::Configure integration to new().  The constructor now
    calls Object::Configure::configure($class, $params) after parameter
    validation, allowing per-class defaults to be loaded from a
    configuration file.  The returned hashref overlays the caller-supplied
    parameters before the object is blessed.

  - Added new Readonly constants: $MAX_MULTIPART_DEPTH (20),
    $CACHE_TTL_SECS (3600), $DEFAULT_TIMEOUT (10), $WHOIS_PORT (43),
    $WHOIS_READ_CHUNK (4096), $WHOIS_RAW_MAX (2048), $RECENT_REG_DAYS
    (180), $EXPIRY_WARN_DAYS (30), $SECS_PER_DAY (86400),
    $DATE_SKEW_DAYS (7), $TZ_MAX_POS_MINS (840), $TZ_MAX_NEG_MINS (720),
    $SCORE_HIGH (9), $SCORE_MEDIUM (5), $SCORE_LOW (2), %FLAG_WEIGHT,
    $ROLE_MAX_LEN (80), $ROLE_WRAP_LEN (66).  All former magic
    numbers have been removed from the code body.

  [Bug Fixes]

  - Fixed NumericBoundary mutator in SchemaExtractor incorrectly treating
    the < operator in file open (open $fh, '<', $path) and readline
    (<$fh>) expressions as numeric comparisons, generating spurious
    mutant variants for those operators.

  - Wrap in eval to catch 'Connection reset by peer' thrown by Fatal/autodie

0.07	Mon Mar 30 08:45:04 EDT 2026

  Bug fixes

  - Added Route 7 to abuse_contacts(): reply addresses found in the message
    body are now looked up in the provider table and generate abuse contacts.
    This catches a common advance-fee and investment scam pattern where the
    From: and Return-Path: headers use a spoofed or compromised address (the
    innocent victim of the fraud), while the real contact address -- a free
    webmail account chosen by the spammer -- is mentioned explicitly in the
    body text, e.g. "contact profcindyinvestments@hotmail.com for details".
    The %TRUSTED_DOMAINS filter is intentionally bypassed for this route:
    being hosted on a trusted provider (Hotmail, Gmail) is what makes these
    addresses actionable, since the spammer chose free webmail precisely
    because it is accessible and anonymous.  Recipient domains (To:, Cc:)
    are still excluded.  Only domains present in %PROVIDER_ABUSE generate
    contacts via this route; unknown domains in the body are ignored.
    The role string includes the specific address found, e.g.
    "Reply address in body (profcindyinvestments@hotmail.com)", so the
    abuse desk knows exactly which account to investigate.

  - Fixed abuse_contacts() generating registrar contacts for innocent domains
    that appear only in spoofable sending headers (From:, Return-Path:,
    Sender:).  These headers are trivially forged; when a spammer uses a
    victim's address as the envelope sender, reporting the victim's domain
    registrar is both unhelpful and potentially harmful to the innocent party.
    The registrar contact is now suppressed when the domain's only source is
    one of these three headers AND the same domain does not also appear as a
    URL host.  If the From: domain appears in a URL as well, the spammer
    controls it and the registrar contact is retained.  Domains sourced from
    Reply-To:, DKIM-Signature:, List-Unsubscribe:, Message-ID:, or the
    message body are unaffected -- those all indicate deliberate spammer
    choice.  Discovered via a real advance-fee spam where qwestoffice.net
    was spoofed as the sender but profcindyinvestments@hotmail.com was the
    actual reply address, causing a false report to CSC Global (registrar
    of qwestoffice.net).

  - Fixed abuse_contacts() generating spurious account-provider contacts
    from SRS-rewritten Return-Path: and Sender: headers.  SRS (Sender
    Rewriting Scheme) addresses are generated by mail forwarders to preserve
    SPF validity and take the form:
      localpart+SRS=hash=timestamp=orig-domain=orig-local@forwarder
    The forwarding domain is not responsible for the spam content and is a
    false abuse target.  Route 4 of abuse_contacts() and form_contacts() now
    skips any addr-spec whose local part matches +SRS= or +SRS0= (case-
    insensitive), covering both the standard SRS0 form and the re-forwarded
    SRS1 variant.  Discovered via a real spam message forwarded through
    groups.outlook.com, which was generating an unwanted
    "Account provider (return-path: ...@groups.outlook.com)" role and
    routing an abuse report to abuse@microsoft.com via the wrong route.
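
    A minimal sketch of the skip test, following only the two patterns
    named in this entry.  is_srs_rewritten() is a hypothetical name
    and the example addresses are invented.

```perl
use strict;
use warnings;

# Hypothetical predicate: true when the local part of an addr-spec
# carries an SRS rewrite marker (+SRS= or +SRS0=, case-insensitive).
sub is_srs_rewritten {
    my ($addr) = @_;
    # Split at the last @ so only the local part is inspected
    my ($local) = $addr =~ /\A(.*)\@[^\@]+\z/s or return 0;
    return $local =~ /\+SRS0?=/i ? 1 : 0;
}

for my $addr ('list+SRS=AbCd=XY=example.org=user@groups.outlook.com',
              'user@example.org') {
    printf "%-55s %s\n", $addr, is_srs_rewritten($addr) ? 'skip' : 'keep';
}
```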

  - Fixed false positive http_not_https risk flag and spurious Gandi abuse
    contacts caused by W3C namespace and DTD URLs in HTML email templates.
    Spam messages sent as HTML frequently contain boilerplate references such
    as:
      http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
      http://www.w3.org/1999/xhtml
    These are injected by the ESP's HTML template engine and have no
    connection to the spam content.  Two fixes applied:
    1. Added w3.org to %TRUSTED_DOMAINS so it is filtered from the domain
       intelligence pipeline and does not generate abuse contacts.
    2. Added a trusted-domain skip at the top of the URL check loop in
       risk_assessment() so that http:// references to trusted domains do
       not raise the http_not_https flag.  The skip uses the same $bare
       (www-stripped) hostname already computed for the shortener check.
    Discovered via a Gandi autoresponse explaining that W3C receives a high
    volume of false positive abuse reports for this reason and maintains a
    FAQ at https://www.w3.org/Help/Webmaster#spam.
    Note: w3c.org (the consortium's secondary domain) is deliberately not
    added -- it rarely appears in HTML boilerplate and any w3c.org URL in a
    spam message would be unusual enough to warrant investigation.

0.06	Sun Mar 29 11:28:02 EDT 2026

  New features

  - Added form_contacts() public method, parallel to abuse_contacts(), which
    returns the set of parties that require abuse reports to be submitted via
    a web form rather than email.  Returns a list of hashrefs each containing
    the form URL, role, note, and instructions on what to paste and what file
    to upload.  Providers are identified via a new optional 'form',
    'form_paste', and 'form_upload' key in %PROVIDER_ABUSE entries.

  - Added MarkMonitor and Global Domain Group to %PROVIDER_ABUSE as
    form-only entries (no 'email' key).  Both registrars explicitly reject
    email abuse reports per their autoresponse:
      markmonitor.com ->
        https://corp.markmonitor.com/domain/ui/abuse-report
      globaldomaingroup.com ->
        https://globaldomaingroup.com/report-abuse
    The form_paste and form_upload hints in each entry tell the user exactly
    what to paste into the form and what file to attach.

  - Added [ WHERE TO FILE WEB-FORM REPORTS ] section to report(), appearing
    after [ WHERE TO SEND ABUSE REPORTS ] when form_contacts() returns
    results.  Each entry shows the form URL, role, paste instructions
    (word-wrapped at 66 characters), and upload instructions.

  - Added WEB-FORM REPORTS REQUIRED: block to abuse_report_text() listing
    form-only contacts with their form URL, paste hint, and upload hint.

  - Added MANUAL ACTION REQUIRED -- WEB FORM SUBMISSION block to the
    --dry-run footer of submit_abuse_report, listing form-only contacts
    separately from email recipients so the user knows which parties require
    manual browser submission.

  Bug fixes / refactoring

  - Fixed abuse_contacts() passing form-only provider addresses (e.g.
    abusecomplaints@markmonitor.com) through to the email contact list.
    Such addresses are syntactically valid but explicitly non-functional per
    the provider's own autoresponse.  The $add closure now checks whether the
    address domain belongs to a form-only %PROVIDER_ABUSE entry (one with a
    'form' key but no 'email' key) and suppresses it; the contact is surfaced
    correctly via form_contacts() instead.

  - Refactored the domain WHOIS fallback in _extract_and_resolve_urls() to
    call _analyse_domain() instead of the separate _parse_domain_whois_abuse()
    helper introduced in 0.05.  _analyse_domain() is already cached in
    $self->{_domain_info}, so when the same spam domain appears in multiple
    URLs the domain WHOIS is performed only once rather than once per unique
    hostname.  _parse_domain_whois_abuse() has been removed.

  - Added use utf8; pragma to Investigator.pm so that the em-dash characters
    in user-facing detail strings are handled correctly under all Perl
    configurations without relying on the source file's implicit encoding.

  - Added GoDaddy to %PROVIDER_ABUSE as a form-only entry.  GoDaddy
    explicitly rejects email abuse reports per their autoresponse, directing
    reporters to their web form instead:
      godaddy.com -> https://supportcenter.godaddy.com/AbuseReport
    The form_upload hint reflects that GoDaddy accepts screenshots or PDFs,
    not .eml files.

  - Added googlegroups.com and groups.google.com to %TRUSTED_DOMAINS.
    Google Groups message IDs (e.g. @googlegroups.com) were entering the
    domain intelligence pipeline and triggering a MarkMonitor web-form
    contact, since Google uses MarkMonitor to register its infrastructure
    domains.  This was a false positive -- MarkMonitor cannot act on a
    Google-owned domain.  Both domains are now filtered at the same point
    as google.com, gmail.com, and other trusted Google infrastructure.

  - Added form_domain field to form_contacts() hashrefs, surfaced as
    Domain/URL in report(), Domain in abuse_report_text(), and Domain in
    the --dry-run footer of submit_abuse_report.  Providers such as
    MarkMonitor and GoDaddy have a dedicated "Domain or URL" field in their
    web forms; this field gives the user the exact value to paste into it
    without having to work it out from context.

  - Improved clarity of role strings in abuse_contacts():
    1. Removed the "(provider table)" suffix from Sending ISP, URL host,
       Web host, and DKIM signer roles.  This was implementation detail
       that added length without helping the recipient understand what action
       to take.  "Sending ISP" is clearer than "Sending ISP (provider
       table)"; the via field already records how the contact was found.
    2. Included the hostname in URL host roles -- "URL host: host.example"
       rather than the generic "URL host".  When multiple routes converge on
       the same abuse address (e.g. a Blogspot URL and a Gmail sending IP
       both map to abuse@google.com), the merged role now makes clear which
       specific URL is being reported, giving the abuse team an actionable
       reference without requiring them to read the full report headers.
    3. Stripped the display name from Account provider roles.  The role
       previously included the full From: header value including the
       spammer's chosen display name (e.g. "Account provider (from: Evil
       Spammer <spam@gmail.com>)").  The display name is irrelevant to the
       abuse report, may contain non-ASCII characters, and makes the merged
       role string much longer than necessary.  The role now shows only the
       email address: "Account provider (from: spam@gmail.com)".

  - Fixed Wide character in syswrite error when submitting reports for
    messages whose decoded subject line contains non-ASCII characters (e.g.
    emoji).  submit_abuse_report uses "use utf8" and
    "use open qw(:std :encoding(UTF-8))", which causes all string operations
    to produce Unicode-flagged strings.  Net::SMTP->datasend() calls
    syswrite() on a raw socket and cannot handle wide characters.
    _build_mime_message() now calls Encode::encode('UTF-8', ...) on the
    report body and original message after CRLF normalisation, converting
    the internal Unicode strings to raw byte strings before transmission.
    Encode has been a core module since Perl 5.8, so no new dependency is added.

0.05	Sat Mar 28 11:52:05 EDT 2026

  Bug fixes

  - Fixed _extract_and_resolve_urls() discarding the registrar abuse
    contact for URL hosts that cannot be resolved to an IP at analysis
    time.  Previously, when _resolve_host() returned undef, _whois_ip()
    was skipped entirely and the host was recorded with abuse=>'(unknown)',
    which caused abuse_contacts() to produce no contact for that host even
    though a domain WHOIS record (and therefore a registrar abuse address)
    existed.  _extract_and_resolve_urls() now falls back to a domain WHOIS
    lookup on the registrable parent of the host when the IP WHOIS yields
    no abuse address.  A new private helper _parse_domain_whois_abuse()
    performs this lookup without the full overhead of _analyse_domain().
    Combined with the protocol-relative URL fix below, this means that the
    badshamart.com spam campaign (PBS Health News / prostate supplement)
    now correctly produces a registrar abuse contact in abuse_contacts()
    even though all four badshamart.com URL hosts were unresolvable.

  - Fixed _extract_http_urls() not extracting protocol-relative URLs
    (scheme-omitted form //domain/path).  These are used in spam messages
    as tracking pixels and click-redirect links, e.g.:
      <img src="//badshamart.com/o/2516/19142/347/US" ...>
    The leading // was not matched by either the https?:// absolute-URL
    regex or the HTML::LinkExtor filter, which also required a full scheme.
    Both passes now recognise the //domain form and normalise it to
    https://domain before adding it to the URL list.  The regex pass
    anchors the match to whitespace, quotes, or = to avoid false positives
    on CSS path segments and HTML comments.
    Discovered via a real spam message (PBS Health News / badshamart.com)
    where three click-redirect hrefs and one tracking-pixel src all used
    protocol-relative URLs, causing badshamart.com to be entirely absent
    from embedded_urls() and therefore from abuse_contacts().
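
    A sketch of the regex pass under stated assumptions: the pattern,
    the helper name extract_protocol_relative(), and the choice to keep
    the path after normalisation are all illustrative rather than the
    module's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper: find //host/path references that follow
# whitespace, a quote, or '=', and normalise them to https://.
sub extract_protocol_relative {
    my ($text) = @_;
    my @urls;
    while ($text =~ m{
            (?:^|[\s"'=])                  # anchor: start, space, quote, or =
            //
            ([A-Za-z0-9](?:[A-Za-z0-9.-]*[A-Za-z0-9])?\.[A-Za-z]{2,})   # host
            (/[^\s"'<>]*)?                 # optional path
        }gx) {
        push @urls, 'https://' . $1 . ($2 // '');
    }
    return @urls;
}

my $html = '<img src="//badshamart.com/o/2516/19142/347/US" alt="">'
         . ' see http://example.com/x';
print "$_\n" for extract_protocol_relative($html);
```

    The absolute URL in the sample is left alone because its // follows
    a colon, which is not one of the anchor characters.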

  - Fixed duplicate Salesforce Marketing Cloud comment block in
    %PROVIDER_ABUSE.  A leftover comment fragment introduced during 0.03
    appeared immediately before the real Salesforce entries, causing
    cosmetic confusion in the source.  Removed the orphaned fragment.

  - Fixed two stale references to Mail::Message::Abuse in the SUPPORT POD
    section: the perldoc command example and the CPAN Testers Dependencies
    URL both still named the old module.  Both now correctly reference
    Email::Abuse::Investigator.

  New features

  - Added Blogger/Blogspot and Google Sites to the built-in provider table
    alongside the existing Google entries:
      blogspot.com       -> abuse@google.com
      blogger.com        -> abuse@google.com
      sites.google.com   -> abuse@google.com
    Blogspot is one of the most commonly abused free hosting platforms for
    spam landing pages.  Subdomains (e.g. ruseriver.blogspot.com) are
    resolved to blogspot.com by the existing subdomain-stripping logic.
    Note: google.com is in %TRUSTED_DOMAINS and is therefore excluded from
    the domain intelligence pipeline; these entries are effective via the
    URL-host and account-provider lookup routes in abuse_contacts().

  - Documented that the {logger} constructor slot may be populated by
    Object::Configure from a configuration file, allowing log output to
    be routed through any Log::* compatible logger rather than STDERR.

0.04	Fri Mar 27 22:01:05 EDT 2026

  Bug fixes

  - Fixed abuse_contacts() silently discarding discovery routes that resolve
    to an address already seen.  When the same abuse address is found via
    multiple routes (e.g. Google as both the sending ISP via rDNS and the
    owner of a blogspot.com URL in the body), the second and subsequent
    roles are now accumulated rather than dropped.  Each hashref in the
    returned list gains a 'roles' arrayref holding the individual role
    strings, and 'role' (singular) is set to their join(' and ', ...) for
    backward compatibility.  The dry-run footer in submit_abuse_report
    now reflects this: a merged entry shows both roles on one line and the
    total line reads "N recipients (M contact routes merged)" when merging
    has occurred.

  - Fixed _decode_multipart() not recursing into nested multipart/* parts.
    A message with Content-Type: multipart/mixed containing a nested
    multipart/alternative (a common structure for HTML+plaintext mail) had
    its body silently discarded, causing embedded_urls() to find no URLs
    and abuse_contacts() to miss all URL-host contacts.  _decode_multipart()
    now detects nested multipart/* parts, extracts the inner boundary from
    the Content-Type header, and recurses to decode the inner container.

  - Fixed abuse_contacts() section 4 (account provider lookup) incorrectly
    matching the domain of an @ sign appearing in a display name rather than
    the actual addr-spec.  A From: header of the form:
      "evil@gmail.com" <real@hotmail.com>
    was matching gmail.com instead of hotmail.com.  The addr-spec is now
    extracted from the rightmost angle-bracket pair before the domain is
    parsed; without angle brackets the whole value is used as before.
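
    A minimal sketch of the rightmost-bracket rule.  addr_spec_domain()
    is a hypothetical name; the module's own parsing may differ in
    detail.

```perl
use strict;
use warnings;

# Hypothetical helper: return the domain of the addr-spec in a header value.
sub addr_spec_domain {
    my ($value) = @_;
    # The greedy .* pushes the match to the rightmost <...> pair; without
    # angle brackets the whole value is treated as the address.
    my $addr = $value =~ /.*<([^<>]*)>/s ? $1 : $value;
    my ($domain) = $addr =~ /\@([^\@\s>]+)\s*\z/;
    return defined $domain ? lc $domain : undef;
}

print addr_spec_domain('"evil@gmail.com" <real@hotmail.com>'), "\n";
```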

  New features

  - Added implausible_timezone (MEDIUM, weight 2) risk flag.  Numeric
    timezone offsets in the Date: header are now validated against the
    real-world range of +1400 (Line Islands) to -1200 (Baker Island).
    Offsets outside that range, or with a minutes field >= 60, raise this
    flag.  Positive and negative bounds are checked separately; a symmetric
    limit would wrongly accept values such as -1300.
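
    A minimal sketch of the asymmetric bounds check.  The function
    below is illustrative only, and treating an unparseable offset as
    implausible is an assumption of the sketch.

```perl
use strict;
use warnings;

# Hypothetical check: true when a numeric Date: offset such as "+0530"
# falls outside +1400 .. -1200 or has a minutes field of 60 or more.
sub implausible_timezone {
    my ($offset) = @_;
    my ($sign, $hh, $mm) = $offset =~ /\A([+-])(\d{2})(\d{2})\z/
        or return 1;    # assumption: unparseable counts as implausible
    return 1 if $mm >= 60;

    my $total = $hh * 60 + $mm;
    my $limit = $sign eq '+' ? 840 : 720;    # +14:00 and -12:00 in minutes
    return $total > $limit ? 1 : 0;
}

for my $tz (qw(+0530 +1400 -1200 -1300 +0560)) {
    printf "%s %s\n", $tz, implausible_timezone($tz) ? 'implausible' : 'ok';
}
```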

  - Added Blogger/Blogspot and Google Sites to the built-in provider table:
      blogspot.com       -> abuse@google.com
      blogger.com        -> abuse@google.com
      sites.google.com   -> abuse@google.com
    Blogspot subdomains (e.g. ruseriver.blogspot.com) are handled by the
    existing subdomain-stripping logic.

  - Added ActiveCampaign to the built-in provider table:
      activecampaign.com  -> abuse@activecampaign.com
      ac-tinker.com       -> abuse@activecampaign.com  (tracking domain)

0.03	Fri Mar 27 19:54:32 EDT 2026

  Bug fixes

  - Fixed spurious abuse reports being sent to the registrar or ISP of the
    message recipient.  Bulk mailers routinely embed the recipient's email
    address in the message body (personalisation footers, unsubscribe
    confirmations, "this email was sent to you@example.com" lines).
    _extract_and_analyse_domains() was collecting domains from the body
    without first excluding the To: and Cc: recipients, causing innocent
    parties to receive abuse reports.  The To:, Cc:, and Received: "for"
    envelope-recipient domains are now built into an exclusion set --
    including their registrable eTLD+1 parents -- before any body or header
    scanning takes place.

  - Fixed "no abuse contacts could be determined" when analysing email
    sent via Salesforce Marketing Cloud (ExactTarget).  Three separate
    causes were identified and corrected:

    1. Salesforce Marketing Cloud was absent from the built-in provider
       table.  Added salesforce.com, mc.salesforce.com, exacttarget.com,
       and et.exacttarget.com, all mapping to abuse@salesforce.com.

    2. Non-routable hostnames such as iad4s13mta756.xt.local (injected
       by Salesforce's MTA into the Message-ID) were passing through the
       domain collection pipeline and consuming a WHOIS lookup slot that
       could never return an actionable result.  The $record closure in
       _extract_and_analyse_domains() now rejects any domain whose TLD is
       not at least two alphabetic characters, and explicitly rejects the
       pseudo-TLDs .local, .internal, .lan, .localdomain, and .arpa.

    3. When a message carries multiple DKIM-Signature headers (common
       with ESPs: the first signs for the customer domain, the second
       for the ESP infrastructure), _parse_auth_results_cached() took
       only the first d= tag and stopped.  It now collects all d= domains
       and sets dkim_domain to whichever one has a hit in the provider
       table -- identifying the actionable ESP -- falling back to the
       first if none match.  All collected domains are fed into the
       domain analysis pipeline via the new dkim_domains arrayref in the
       auth results hashref.
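
    The rejection rule from cause 2 above can be sketched as follows.
    is_routable_domain() is a hypothetical name, and reading "at least
    two alphabetic characters" as "two or more letters and nothing
    else" is an assumption.

```perl
use strict;
use warnings;

# Hypothetical filter: false for hostnames whose TLD cannot belong to
# a publicly registered domain.
sub is_routable_domain {
    my ($domain) = @_;
    my ($tld) = lc($domain) =~ /\.([^.]+)\z/ or return 0;
    return 0 unless $tld =~ /\A[a-z]{2,}\z/;
    return 0 if $tld =~ /\A(?:local|internal|lan|localdomain|arpa)\z/;
    return 1;
}

for my $host (qw(iad4s13mta756.xt.local firmluminary.com)) {
    printf "%-25s %s\n", $host, is_routable_domain($host) ? 'keep' : 'reject';
}
```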

  - The --dry-run output of submit_abuse_report now appends a compact
    recipient summary at the foot of the report:

        Total: 2 recipients

          abuse@tpg.com.au (Sending ISP)
          abuse@godaddy.com (Domain registrar for firmluminary.com)

    Previously only the count was shown.  The summary allows a user to
    confirm at a glance who would receive reports without scrolling back
    through the full numbered table.

  - submit_abuse_report now produces fully RFC 5965 (ARF) compliant
    messages.  The MIME structure changed from multipart/mixed (two parts)
    to multipart/report; report-type=feedback-report (three parts):
      Part 1  text/plain                 human-readable abuse report
      Part 2  message/feedback-report    ARF machine-readable metadata
      Part 3  message/rfc822             original spam message verbatim
    The feedback-report part includes Feedback-Type, Version, User-Agent,
    Source-IP, Original-Mail-From, Original-Rcpt-To, Arrival-Date,
    Reported-Domain, Reported-Uri (one per URL), and Authentication-Results.

0.02	Fri Mar 27 19:04:37 EDT 2026
  - Added bin/submit_abuse_report

0.01	Fri Mar 27 14:23:09 EDT 2026
        First draft
