
SpiderFoot Documentation about:reader?url=http://www.spiderfoot.net/doc...

spiderfoot.net

SpiderFoot Documentation

About

SpiderFoot is an open source intelligence automation tool. Its goal is to automate the process of
gathering intelligence about a given target, which may be an IP address, domain name, hostname
or network subnet.

SpiderFoot can be used offensively, i.e. as part of a black-box penetration test to gather
information about the target, or defensively, to identify what information your organisation is
freely providing for attackers to use against you.

Pre-Requisites

Linux/BSD/Solaris

SpiderFoot is written in Python (2.7), so to run on Linux/Solaris/FreeBSD/etc. you need Python
2.7 installed, in addition to the lxml, netaddr, M2Crypto, CherryPy and Mako modules.

To install the dependencies using PIP, run the following:

~$ pip install lxml netaddr M2Crypto cherrypy mako

Other modules such as MetaPDF, SOCKS and more are included in the SpiderFoot package, so
you don't need to install them separately.

Windows

SpiderFoot for Windows is a compiled executable file, and so all dependencies are packaged with
it.

No third party tools/libraries need to be installed, not even Python.

1 of 26 11/09/2016 12:47 PM

Installing

Installing SpiderFoot is literally as simple as unpacking the distribution tar.gz/zip file.

Linux/BSD/Solaris

To install SpiderFoot on Linux/Solaris/FreeBSD/etc. you only need to extract the tar.gz package,
as follows:

~$ tar zxvf spiderfoot-X.X.X-src.tar.gz
~$ cd spiderfoot-X.X.X
~/spiderfoot-X.X.X$

Windows

Unzip the distribution ZIP file to a folder of your choice. Yep, that's it.

Starting SpiderFoot

Linux/BSD/Solaris

To run SpiderFoot, simply execute sf.py from the directory you extracted SpiderFoot into:

~/spiderfoot-X.X.X$ python ./sf.py

Once executed, a web-server will be started, which by default will listen on 127.0.0.1:5001. You
can then use the web-browser of your choice by browsing to http://127.0.0.1:5001.

If you wish to make SpiderFoot accessible from another system, for example running it on a server
and controlling it remotely, then you can specify an external IP for SpiderFoot to bind to, or use
0.0.0.0 so that it binds to all addresses, including 127.0.0.1:

~/spiderfoot-X.X.X$ python ./sf.py 0.0.0.0:5001

If port 5001 is used by another application on your system, you can change the port:

~/spiderfoot-X.X.X$ python ./sf.py 127.0.0.1:9999
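
Before choosing a port, it can help to check whether something is already listening on it. The
following is a minimal standalone sketch (written for Python 3, not part of SpiderFoot); the
function name port_is_free is made up for this illustration:

```python
import socket


def port_is_free(host, port):
    """Return True if this process can bind host:port, i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    # 5001 is the default port mentioned above; 9999 is the alternative example.
    for candidate in (5001, 9999):
        state = "free" if port_is_free("127.0.0.1", candidate) else "in use"
        print(candidate, state)
```

The check is only a snapshot: another program could still take the port between the check and
the moment SpiderFoot starts.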

Windows

SpiderFoot for Windows comes as a pre-packaged executable, with no need to install any
dependencies.

For now, there is no installer wizard, so all that's needed is to unzip the package into a directory
(e.g. C:\SpiderFoot) and run sf.exe:

C:\SpiderFoot>sf.exe

As with Linux, you can also specify the IP and port to bind to:

C:\SpiderFoot>sf.exe 0.0.0.0:9999

Caution!

By default, SpiderFoot does not authenticate users connecting to its user-interface or serve over
HTTPS, so avoid running it on a server/workstation that can be accessed from untrusted devices,
as they will be able to control SpiderFoot remotely and initiate scans from your devices. As of
SpiderFoot 2.7, to use authentication and HTTPS, see the Security section below.

Security

With version 2.7, SpiderFoot introduced authentication as well as TLS/SSL support. These are
automatic based on the presence of specific files.

Authentication

SpiderFoot will require digest authentication if a file named passwd exists in the
SpiderFoot root directory. The format of the file is simple: create one entry per account, in the
format of:

username:password

For example:

admin:supersecretpassword

Once the file is created, restart SpiderFoot.
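
If you prefer to generate the file from a script, the format above can be written as follows.
This is a hedged sketch for Python 3, not a SpiderFoot utility; write_passwd is a hypothetical
helper name, and restricting the file permissions is a suggestion of this example rather than
something SpiderFoot requires:

```python
import os


def write_passwd(path, accounts):
    """Write one username:password line per account to the given path."""
    for username, password in accounts.items():
        # A colon in the username or a newline anywhere would corrupt the format.
        if ":" in username or "\n" in username or "\n" in password:
            raise ValueError("invalid character in credentials for %r" % username)
    with open(path, "w") as handle:
        for username, password in accounts.items():
            handle.write("%s:%s\n" % (username, password))
    # The passwords are stored as plain text, so limit access to the file owner
    # (this has little effect on Windows).
    os.chmod(path, 0o600)
```

For example, write_passwd("passwd", {"admin": "supersecretpassword"}) run from the SpiderFoot
root directory produces the single-account file shown above.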

TLS/SSL

SpiderFoot will serve HTTPS (and only that) if it detects the existence of a public certificate and
key file in SpiderFoot's root directory. This means whatever port you set SpiderFoot to listen on is
the port on which TLS/SSL will be used. It is not possible for SpiderFoot to serve both HTTP and
HTTPS simultaneously on different ports. If you need to do that, an nginx proxy in front of
SpiderFoot would be a better solution.

Simply place two files in the SpiderFoot directory: spiderfoot.crt (public certificate in PEM
format) and spiderfoot.key (RSA private key in PEM format). Restart SpiderFoot and you
will now be serving HTTPS only.
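
Because the switch to HTTPS depends only on those two files being present, a quick pre-flight
check can tell you which mode to expect after a restart. The sketch below (Python 3) simply
restates the documented rule; it is not SpiderFoot's own detection code, and will_serve_https is
a name invented for this example:

```python
import os


def will_serve_https(root_dir):
    """Return True if both spiderfoot.crt and spiderfoot.key exist in root_dir."""
    required = ("spiderfoot.crt", "spiderfoot.key")
    return all(os.path.isfile(os.path.join(root_dir, name)) for name in required)
```

It does not validate the contents of either file, so a True result only means the files exist.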

For instructions on generating a self-signed certificate, check out this StackOverflow article.

API Keys

A few SpiderFoot modules require API keys, or perform better when they are supplied.

Honeypot Checker

1. Go to http://www.projecthoneypot.org

2. Sign up (free) and log in

3. Click Services -> HTTP Blacklist

4. An API key should be listed

5. Copy and paste that key into the Settings -> Honeypot Checker section in SpiderFoot

SHODAN

1. Go to http://www.shodanhq.com

2. Sign up (free) and log in

3. Click Developer Center

4. On the far right your API key should appear in a box

5. Copy and paste that key into the Settings -> SHODAN section in SpiderFoot

VirusTotal

1. Go to http://www.virustotal.com

2. Sign up (free) and log in

3. Click your username in the far right and select My API Key

4. Copy and paste the key in the grey box into the Settings -> VirusTotal section in SpiderFoot

IBM X-Force Exchange

1. Go to https://exchange.xforce.ibmcloud.com/new

2. Create an IBM ID (free) and log in

3. Go to your account settings

4. Click API Access

5. Generate the API key and password (you need both)

6. Copy and paste the key and password into the Settings -> X-Force section in SpiderFoot

MalwarePatrol

1. Go to http://www.malwarepatrol.net

2. Create an account (free) and log in

3. Click Open Source and scroll down to the bottom

4. Click the Free link in the subscription pricing table

5. Click the free block lists link

6. You will receive a receipt ID

7. Copy and paste the receipt ID into the Settings -> MalwarePatrol section in SpiderFoot

BotScout

1. Go to http://www.botscout.com

2. Create an account (free) and log in

3. Your API key is listed under Account Info

4. Copy and paste the API key into the Settings -> BotScout section in SpiderFoot

Using SpiderFoot

Running a Scan

When you run SpiderFoot for the first time, there is no historical data, so you should be presented
with a screen like the following:

To initiate a scan, click on the New Scan button in the top menu bar. You will then need to define
a name for your scan (these are non-unique) and a target (also non-unique):

You can then define how you would like to run the scan - either by use case (the tab selected by
default), by data required or by module.

Module-based scanning is for more advanced users who are familiar with the behavior and data
provided by different modules, and want more control over the scan:

Beware though, there is no dependency checking when scanning by module, only for scanning by
required data. This means that if you select a module that depends on event types only provided
by other modules, but those modules are not selected, you will get no results.
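
To see why a module selection can yield no results, it may help to work through the dependency
rule by hand. The sketch below (Python 3) is an illustration only, not SpiderFoot code. The
catalogue is hypothetical: the sfp_names entry follows the module code shown in the Modules
section, while the other entries and the use of ROOT as the seed event are simplifying
assumptions made for this example.

```python
# Hypothetical, simplified catalogue of watched and produced event types.
CATALOGUE = {
    "sfp_spider": {"watches": ["ROOT"], "produces": ["TARGET_WEB_CONTENT"]},
    "sfp_names": {"watches": ["TARGET_WEB_CONTENT", "EMAILADDR"],
                  "produces": ["HUMAN_NAME"]},
    "sfp_socialprofiles": {"watches": ["HUMAN_NAME"], "produces": ["SOCIAL_MEDIA"]},
}


def starved_modules(selected, catalogue, seed_events=("ROOT",)):
    """Return the selected modules that can never receive an event."""
    reachable = set(seed_events)  # event types that will exist during the scan
    active = set()                # modules known to receive at least one event
    changed = True
    while changed:
        changed = False
        for name in selected:
            if name in active:
                continue
            watches = set(catalogue[name]["watches"])
            # "*" means the module is notified about all events.
            if "*" in watches or watches & reachable:
                active.add(name)
                reachable.update(catalogue[name]["produces"])
                changed = True
    return sorted(set(selected) - active)
```

With this catalogue, selecting only sfp_names reports it as starved, while selecting sfp_spider
together with sfp_names reports nothing.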

Scan Results

From the moment you click Run Scan, you will be taken to a screen for monitoring your scan in
near real time:

That screen is made up of a graph showing a breakdown of the data obtained so far, plus log
messages generated by SpiderFoot and its modules.

The bars of the graph are clickable, taking you to the result table for that particular data type.

Browsing Results

By clicking on the Browse button for a scan, you can browse the data by type:

This data is exportable and searchable. Click the Search box to get a pop-up explaining how to
perform searches.

By clicking on one of the data types, you will be presented with the actual data:

The fields displayed are explained as follows:

Checkbox field: Use this to set/unset fields as false positive. Once at least one is checked, click
the orange False Positive button above to set/unset the record.

Data Element: The data the module was able to obtain about your target.

Source Data Element: The data the module received as the basis for its data collection. In the
example above, the sfp_portscan_tcp module received an event about an open port, and used
that to obtain the banner on that port.

Source Module: The module that identified this data.

Identified: When the data was identified by the module.

You can click the black icons to modify how this data is represented. For instance you can get a
unique data representation by clicking the Unique Data View icon:

Setting False Positives

Version 2.6.0 introduced the ability to set data records as false positive. As indicated in the
previous section, use the checkbox and the orange button to set/unset records as false positive:

Once you have set records as false positive, you will see an indicator next to those records, and
have the ability to filter them from view, as shown below:

NOTE: Records can only be set to false positive once a scan has finished running. This is because
setting a record to false positive also results in all child data elements being set to false positive.
This cannot be done safely while the scan is still running, as it could leave the back-end in an
inconsistent state, so the UI will prevent you from doing so.

The result of a record being set to false positive, aside from the indicator in the data table view and
exports, is that such data will not be shown in the node graphs.

Searching Results

Results can be searched either at the whole scan level, or within individual data types. The scope
of the search is determined by the screen you are on at the time.

As indicated by the pop-up box when selecting the search field, you can search as follows:

Exact value: Non-wildcard searching for a specific value. For example, search for 404 within
the HTTP Status Code section to see all pages that were not found.

Pattern matching: Search for simple wildcards to find patterns. For example, search for *:22
within the Open TCP Port section to see all instances of port 22 open.

Regular expression searches: Encapsulate your string in / to search by regular expression.
For example, search for /\d+\.\d+\.\d+\.\d+/ to find anything looking like an IP address in
your scan results.
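
The three search styles can be pictured with a few lines of Python 3. This is a sketch of the
behaviour described above, not the code SpiderFoot uses; the function name matches and the
choice of fnmatch for wildcards are assumptions of this example.

```python
import fnmatch
import re


def matches(value, query):
    """Apply exact, wildcard or regular expression matching to one value."""
    if len(query) >= 2 and query.startswith("/") and query.endswith("/"):
        # /.../ is treated as a regular expression found anywhere in the value.
        return re.search(query[1:-1], value) is not None
    if "*" in query:
        # A wildcard pattern must match the whole value.
        return fnmatch.fnmatchcase(value, query)
    # Anything else is an exact comparison.
    return value == query
```

For instance, matches("192.0.2.10:22", "*:22") is true, while the same pattern does not match
"192.0.2.10:2222". Note that fnmatch also gives ? and [ ] a special meaning, which may differ
from SpiderFoot's own wildcard handling.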

Managing Scans

When you have some historical scan data accumulated, you can use the list available on the Scans
section to manage them:

You can filter the scans shown by altering the Filter drop-down selection. Except for the green
refresh icon, all icons on the right apply to whichever scans you have checked the
checkboxes for.

Tor Integration

Refer to this post for more information.

Modules

Overview

SpiderFoot has all data collection modularised. When a module discovers a piece of data, that data
is transmitted to all other modules that are interested in that data type for processing. Those
modules will then act on that piece of data to identify new data, and in turn generate new events
for other modules which may be interested, and so on.

For example, sfp_dns may identify an IP address associated with your target, notifying all
interested modules. One of those interested modules would be the sfp_ir module, which will
take that IP address and identify the netblock it is a part of, the BGP ASN and so on.

This might be best illustrated by looking at module code. For example, the sfp_names module
looks for TARGET_WEB_CONTENT and EMAILADDR events for identifying human names:

# What events is this module interested in for input
# * = be notified about all events.
def watchedEvents(self):
    return ["TARGET_WEB_CONTENT", "EMAILADDR"]

# What events this module produces
# This is to support the end user in selecting modules based on events
# produced.
def producedEvents(self):
    return ["HUMAN_NAME"]

Meanwhile, as each event is generated by a module, it is also recorded in the SpiderFoot database
for reporting and viewing in the UI.
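
The flow of events between modules can be sketched in a few dozen lines. The code below
(Python 3) is a toy model under stated assumptions, not SpiderFoot's implementation: the Event,
Module, FakeDns, FakeIr and Recorder classes and the deliver and run_demo functions are all
invented for this illustration, and the addresses are placeholder values rather than real
lookups. Only the method names watchedEvents, handleEvent and notifyListeners are taken from
this documentation.

```python
class Event(object):
    """Simplified stand-in for a SpiderFoot event."""

    def __init__(self, event_type, data, module, source=None):
        self.event_type = event_type
        self.data = data
        self.module = module
        self.source = source  # the event that led to this one


def deliver(modules, event):
    """Hand the event to every module that watches its type (or "*")."""
    for module in modules:
        watched = module.watchedEvents()
        if "*" in watched or event.event_type in watched:
            module.handleEvent(event)


class Module(object):
    name = "module"

    def __init__(self):
        self.listeners = []

    def watchedEvents(self):
        return []

    def handleEvent(self, event):
        pass

    def notifyListeners(self, event):
        deliver(self.listeners, event)


class FakeDns(Module):
    name = "sfp_dns"

    def watchedEvents(self):
        return ["INTERNET_NAME"]

    def handleEvent(self, event):
        # A real module would resolve event.data; this address is a placeholder.
        self.notifyListeners(Event("IP_ADDRESS", "192.0.2.1", self.name, event))


class FakeIr(Module):
    name = "sfp_ir"

    def watchedEvents(self):
        return ["IP_ADDRESS"]

    def handleEvent(self, event):
        # Placeholder netblock instead of a registry query.
        self.notifyListeners(Event("NETBLOCK_MEMBER", "192.0.2.0/24", self.name, event))


class Recorder(Module):
    """Watches everything, much as a storage module would."""
    name = "recorder"

    def __init__(self):
        Module.__init__(self)
        self.seen = []

    def watchedEvents(self):
        return ["*"]

    def handleEvent(self, event):
        self.seen.append((event.event_type, event.data, event.module))


def run_demo():
    modules = [FakeDns(), FakeIr(), Recorder()]
    for module in modules:
        module.listeners = [other for other in modules if other is not module]
    deliver(modules, Event("INTERNET_NAME", "example.com", "seed"))
    return modules[-1].seen
```

Calling run_demo() returns three recorded events: the seed name, the IP address derived from it
and the netblock derived from that address, which mirrors the sfp_dns to sfp_ir chain described
above.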

Module List

The table below is an up-to-date list of all SpiderFoot modules and a short summary of their
capabilities.

Module | Module Name | Description
sfp_accounts.py | Accounts | Look for possible associated accounts on nearly 200 websites like Ebay, Slashdot, reddit, etc.
sfp_adblock.py | AdBlock Check | Check if linked pages would be blocked by AdBlock Plus.
sfp_bingsearch.py | Bing | Some light Bing scraping to identify sub-domains and links.
sfp_blacklist.py | Blacklist | Query various blacklist databases for open relays, open proxies, vulnerable servers, etc.
sfp_botscout.py | BotScout | Searches botscout.com's database of spam-bot IPs and e-mail addresses.
sfp_coderepo.py | Code Repos | Identify associated public code repositories (Github only for now).
sfp_cookie.py | Cookies | Extract Cookies from HTTP headers.
sfp_crossref.py | Cross-Reference | Identify whether other domains are associated (Affiliates) of the target.
sfp_darksearch.py | Darknet | Search Tor Onion City search engine for mentions of the target domain.
sfp_defaced.py | Defacement Check | Check if a hostname/domain appears on the zone-h.org special defacements RSS feed.
sfp_dns.py | DNS | Performs a number of DNS checks to obtain Sub-domains/Hostnames, IP Addresses and Affiliates.
sfp_duckduckgo.py | DuckDuckGo | Query DuckDuckGo's API for descriptive information about your target.
sfp_email.py | E-Mail | Identify e-mail addresses in any obtained data.
sfp_errors.py | Errors | Identify common error messages in content like SQL errors, etc.
sfp_filemeta.py | File Metadata | Extracts meta data from documents and images.
sfp_geoip.py | GeoIP | Identifies the physical location of IP addresses identified.
sfp_googlesearch.py | Google | Some light Google scraping to identify sub-domains and links.
sfp_historic.py | Historic Files | Identifies historic versions of interesting files/pages from the Wayback Machine.
sfp_honeypot.py | Honeypot Checker | Query the projecthoneypot.org database for entries.
sfp_hosting.py | Hosting Providers | Find out if any IP addresses identified fall within known 3rd party hosting ranges, e.g. Amazon, Azure, etc.
sfp_intfiles.py | Interesting Files | Identifies potential files of interest, e.g. office documents, zip files.
sfp_ir.py | Internet Registries | Queries Internet Registries to identify netblocks and other info.
sfp_junkfiles.py | Junk Files | Looks for old/temporary and other similar files.
sfp_malcheck.py | Malicious Check | Check if a website, IP or ASN is considered malicious by various sources. Includes TOR exit nodes and open proxies.
sfp_malwarepatrol.py | MalwarePatrol | Searches malwarepatrol.net's database of malicious URLs/IPs.
sfp_names.py | Name Extractor | Attempt to identify human names in fetched content.
sfp_pageinfo.py | Page Info | Obtain information about web pages (do they take passwords, do they contain forms, ...)
sfp_pastes.py | Pastes | PasteBin, Pastie and Notepad.cc scraping (via Google) to identify related content.
sfp_pgp.py | PGP Key Look-up | Look up e-mail addresses in PGP public key servers.
sfp_phone.py | Phone Numbers | Identify phone numbers in scraped webpages.
sfp_portscan_tcp.py | Port Scanner - TCP | Scans for commonly open TCP ports on Internet-facing systems.
sfp_pwned.py | Pwned Password | Check Have I Been Pwned? for hacked accounts identified.
sfp_s3bucket.py | S3 Bucket Finder | Search for potential S3 buckets associated with the target.
sfp_sharedip.py | Shared IP | Search Bing and/or Robtex.com for hosts sharing the same IP.
sfp_shodan.py | SHODAN | Obtain information from SHODAN about identified IP addresses.
sfp_socialprofiles.py | Social Media Profiles | Identify the social media profiles for human names identified.
sfp_social.py | Social Networks | Identify presence on social media networks such as LinkedIn, Twitter and others.
sfp_spider.py | Spider | Spidering of web-pages to extract content for searching.
sfp_sslcert.py | SSL | Gather information about SSL certificates used by the target's HTTPS sites.
sfp__stor_db.py | Storage | Stores scan results into the back-end SpiderFoot database. You will need this.
sfp_strangeheaders.py | Strange Headers | Obtain non-standard HTTP headers returned by web servers.
sfp_template.py | Name | Description
sfp_tldsearch.py | TLD Search | Search all Internet TLDs for domains with the same name as the target (this can be very slow.)
sfp_virustotal.py | VirusTotal | Obtain information from VirusTotal about identified IP addresses.
sfp_vuln.py | Vulnerable | Check external vulnerability scanning services (XSSposed.org, punkspider.org) to see if the target is listed.
sfp_webframework.py | Web Framework | Identify the usage of popular web frameworks like jQuery, YUI and others.
sfp_websvr.py | Web Server | Obtain web server banners to identify versions of web servers being used.
sfp_whois.py | Whois | Perform a WHOIS look-up on domain names and owned netblocks.
sfp_xforce.py | XForce Exchange | Obtain information from IBM X-Force Exchange
sfp_yahoosearch.py | Yahoo | Some light Yahoo scraping to identify sub-domains and links.

Data Elements

As mentioned above, SpiderFoot works on an event-driven model, whereby each module
generates events about data elements which other modules listen to and consume.

The data elements are one of the following types:

entities like IP addresses, Internet names (hostnames, sub-domains, domains),

sub-entities like port numbers, URLs and software installed,

descriptors of those entities (malicious, physical location information, ...) or

data which is mostly unstructured data (web page content, port banners, raw DNS records, ...)

Here are all the available data elements built into SpiderFoot:

Element ID | Element Name | Element Data Type
ACCOUNT_EXTERNAL_OWNED | Account on External Site | ENTITY
ACCOUNT_EXTERNAL_OWNED_COMPROMISED | Hacked Account on External Site | DESCRIPTOR
ACCOUNT_EXTERNAL_USER_SHARED | User Account on External Site | ENTITY
ACCOUNT_EXTERNAL_USER_SHARED_COMPROMISED | Hacked User Account on External Site | DESCRIPTOR
AFFILIATE_INTERNET_NAME | Affiliate - Internet Name | ENTITY
AFFILIATE_IPADDR | Affiliate - IP Address | ENTITY
AFFILIATE_WEB_CONTENT | Affiliate - Web Content | DATA
AFFILIATE_DESCRIPTION_CATEGORY | Affiliate Description - Category | DESCRIPTOR
AFFILIATE_DESCRIPTION_ABSTRACT | Affiliate Description - Abstract | DESCRIPTOR
AMAZON_S3_BUCKET | Amazon S3 Bucket | ENTITY
BGP_AS_OWNER | BGP AS Ownership | ENTITY
BGP_AS_MEMBER | BGP AS Membership | ENTITY
BGP_AS_PEER | BGP AS Peer | ENTITY
BLACKLISTED_IPADDR | Blacklisted IP Address | DESCRIPTOR
BLACKLISTED_AFFILIATE_IPADDR | Blacklisted Affiliate IP Address | DESCRIPTOR
BLACKLISTED_SUBNET | Blacklisted IP on Same Subnet | DESCRIPTOR
BLACKLISTED_NETBLOCK | Blacklisted IP on Owned Netblock | DESCRIPTOR
CO_HOSTED_SITE | Co-Hosted Site | ENTITY
DARKNET_MENTION_URL | Darknet Mention URL | DESCRIPTOR
DARKNET_MENTION_CONTENT | Darknet Mention Web Content | DATA
DEFACED_INTERNET_NAME | Defaced | DESCRIPTOR
DEFACED_IPADDR | Defaced IP Address | DESCRIPTOR
DEFACED_AFFILIATE_INTERNET_NAME | Defaced Affiliate | DESCRIPTOR
DEFACED_COHOST | Defaced Co-Hosted Site | DESCRIPTOR
DEFACED_AFFILIATE_IPADDR | Defaced Affiliate IP Address | DESCRIPTOR
DESCRIPTION_CATEGORY | Description - Category | DESCRIPTOR
DESCRIPTION_ABSTRACT | Description - Abstract | DESCRIPTOR
DEVICE_TYPE | Device Type | DESCRIPTOR
DNS_TEXT | DNS TXT Record | DATA
DOMAIN_NAME | Domain Name | ENTITY
DOMAIN_REGISTRAR | Domain Registrar | ENTITY
DOMAIN_WHOIS | Domain Whois | DATA
EMAILADDR | Email Address | ENTITY
EMAILADDR_COMPROMISED | Hacked Email Address | DESCRIPTOR
ERROR_MESSAGE | Error Message | DATA
GEOINFO | Physical Location | DESCRIPTOR
HTTP_CODE | HTTP Status Code | DATA
HUMAN_NAME | Human Name | ENTITY
INTERESTING_FILE | Interesting File | DESCRIPTOR
INTERESTING_FILE_HISTORIC | Historic Interesting File | DESCRIPTOR
JUNK_FILE | Junk File | DESCRIPTOR
INTERNET_NAME | Internet Name | ENTITY
IP_ADDRESS | IP Address | ENTITY
IPV6_ADDRESS | IPv6 Address | ENTITY
LINKED_URL_INTERNAL | Linked URL - Internal | SUBENTITY
LINKED_URL_EXTERNAL | Linked URL - External | SUBENTITY
MALICIOUS_ASN | Malicious AS | DESCRIPTOR
MALICIOUS_IPADDR | Malicious IP Address | DESCRIPTOR
MALICIOUS_COHOST | Malicious Co-Hosted Site | DESCRIPTOR
MALICIOUS_EMAILADDR | Malicious E-mail Address | DESCRIPTOR
MALICIOUS_INTERNET_NAME | Malicious Internet Name | DESCRIPTOR
MALICIOUS_AFFILIATE_INTERNET_NAME | Malicious Affiliate | DESCRIPTOR
MALICIOUS_AFFILIATE_IPADDR | Malicious Affiliate IP Address | DESCRIPTOR
MALICIOUS_NETBLOCK | Malicious IP on Owned Netblock | DESCRIPTOR
MALICIOUS_SUBNET | Malicious IP on Same Subnet | DESCRIPTOR
NETBLOCK_OWNER | Netblock Ownership | ENTITY
NETBLOCK_MEMBER | Netblock Membership | ENTITY
NETBLOCK_WHOIS | Netblock Whois | DATA
OPERATING_SYSTEM | Operating System | DESCRIPTOR
PASTESITE_CONTENT | Paste Site Content | DATA
PHONE_NUMBER | Phone Number | ENTITY
PGP_KEY | PGP Public Key | DATA
PROVIDER_DNS | Name Server (DNS NS Records) | ENTITY
PROVIDER_JAVASCRIPT | Externally Hosted Javascript | ENTITY
PROVIDER_MAIL | Email Gateway (DNS MX Records) | ENTITY
PROVIDER_HOSTING | Hosting Provider | ENTITY
PUBLIC_CODE_REPO | Public Code Repository | ENTITY
RAW_RIR_DATA | Raw Data from RIRs | DATA
RAW_DNS_RECORDS | Raw DNS Records | DATA
RAW_FILE_META_DATA | Raw File Meta Data | DATA
SEARCH_ENGINE_WEB_CONTENT | Search Engines Web Content | DATA
SOCIAL_MEDIA | Social Media Presence | ENTITY
SIMILARDOMAIN | Similar Domain | ENTITY
SOFTWARE_USED | Software Used | SUBENTITY
SSL_CERTIFICATE_RAW | SSL Certificate - Raw Data | DATA
SSL_CERTIFICATE_ISSUED | SSL Certificate - Issued to | ENTITY
SSL_CERTIFICATE_ISSUER | SSL Certificate - Issued by | ENTITY
SSL_CERTIFICATE_MISMATCH | SSL Certificate Host Mismatch | DESCRIPTOR
SSL_CERTIFICATE_EXPIRED | SSL Certificate Expired | DESCRIPTOR
SSL_CERTIFICATE_EXPIRING | SSL Certificate Expiring | DESCRIPTOR
TARGET_WEB_CONTENT | Web Content | DATA
TARGET_WEB_COOKIE | Cookies | DATA
TCP_PORT_OPEN | Open TCP Port | SUBENTITY
TCP_PORT_OPEN_BANNER | Open TCP Port Banner | DATA
UDP_PORT_OPEN | Open UDP Port | SUBENTITY
UDP_PORT_OPEN_INFO | Open UDP Port Information | DATA
URL_ADBLOCKED_EXTERNAL | URL (AdBlocked External) | DESCRIPTOR
URL_ADBLOCKED_INTERNAL | URL (AdBlocked Internal) | DESCRIPTOR
URL_FORM | URL (Form) | DESCRIPTOR
URL_FLASH | URL (Uses Flash) | DESCRIPTOR
URL_JAVASCRIPT | URL (Uses Javascript) | DESCRIPTOR
URL_WEB_FRAMEWORK | URL (Uses a Web Framework) | DESCRIPTOR
URL_JAVA_APPLET | URL (Uses Java Applet) | DESCRIPTOR
URL_STATIC | URL (Purely Static) | DESCRIPTOR
URL_PASSWORD | URL (Accepts Passwords) | DESCRIPTOR
URL_UPLOAD | URL (Accepts Uploads) | DESCRIPTOR
URL_FORM_HISTORIC | Historic URL (Form) | DESCRIPTOR
URL_FLASH_HISTORIC | Historic URL (Uses Flash) | DESCRIPTOR
URL_JAVASCRIPT_HISTORIC | Historic URL (Uses Javascript) | DESCRIPTOR
URL_WEB_FRAMEWORK_HISTORIC | Historic URL (Uses a Web Framework) | DESCRIPTOR
URL_JAVA_APPLET_HISTORIC | Historic URL (Uses Java Applet) | DESCRIPTOR
URL_STATIC_HISTORIC | Historic URL (Purely Static) | DESCRIPTOR
URL_PASSWORD_HISTORIC | Historic URL (Accepts Passwords) | DESCRIPTOR
URL_UPLOAD_HISTORIC | Historic URL (Accepts Uploads) | DESCRIPTOR
USERNAME | Username | ENTITY
VULNERABILITY | Vulnerability in Public Domain | DESCRIPTOR
WEBSERVER_BANNER | Web Server | DATA
WEBSERVER_HTTPHEADERS | HTTP Headers | DATA
WEBSERVER_STRANGEHEADER | Non-Standard HTTP Header | DATA
WEBSERVER_TECHNOLOGY | Web Technology | DESCRIPTOR

Writing a Module

To write a SpiderFoot module, start by looking at the sfp_template.py file, which is a
skeleton module that does nothing. Use the following steps as your guide:

1. Create a copy of sfp_template.py to whatever your module will be named. Try and
make this something descriptive, i.e. not something like sfp_mymodule.py but instead
something like sfp_imageanalyser.py if you were creating a module to analyse image
content.

2. Replace XXX in the new module with the name of your module and update the descriptive
information in the header and comment within the module.

3. The comment for the class (check in sfp_template.py) is used by SpiderFoot in the UI
to correctly categorise modules, so make it something meaningful. Look at other modules for
examples.

4. Set the events in watchedEvents() and producedEvents() accordingly, based on
the data element table in the previous section. If you are producing a new data element not
pre-existing in SpiderFoot, you must create this in the database:

~/spiderfoot-X.X.X$ sqlite3 spiderfoot.db
sqlite> INSERT INTO tbl_event_types (event, event_descr,
event_raw, event_type) VALUES ('NEW_DATA_ELEMENT_TYPE_NAME_HERE',
'Description of your New Data Element Here', 0,
'DESCRIPTOR or DATA or ENTITY or SUBENTITY');

5. Put the logic for the module in handleEvent(). Each call to handleEvent() is
provided a SpiderFootEvent object. The most important values within this object are:

eventType: The data element ID (IP_ADDRESS, WEBSERVER_BANNER, etc.)

data: The actual data, e.g. the IP address or web server banner, etc.

module: The name of the module that produced the event (sfp_dns, etc.)

6. When it is time to generate your event, create an instance of SpiderFootEvent:

e = SpiderFootEvent("IP_ADDRESS", ipaddr, self.__name__, event)

Note: the event passed as the last argument is the event that your module received.
This is what builds a relationship between data elements in the SpiderFoot database.

7. Notify all modules that may be interested in the event:

self.notifyListeners(e)
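
Putting steps 4 to 7 together, a module's core might look like the sketch below (Python 3).
This is a hedged illustration, not a drop-in module: SpiderFootEvent and SpiderFootPlugin here
are small stand-ins written for this example so that it runs on its own, the attribute name
sourceEvent is an assumption, and sfp_ipfinder is a hypothetical module. A real module should
start from sfp_template.py and use the classes shipped with SpiderFoot.

```python
import re


class SpiderFootEvent(object):
    """Stand-in taking the same four arguments as the call in step 6."""

    def __init__(self, eventType, data, module, sourceEvent):
        self.eventType = eventType
        self.data = data
        self.module = module
        self.sourceEvent = sourceEvent  # assumed name for the parent event


class SpiderFootPlugin(object):
    """Stand-in base class that only collects what the module emits."""

    def __init__(self):
        self.__name__ = type(self).__name__
        self.emitted = []

    def notifyListeners(self, event):
        self.emitted.append(event)


class sfp_ipfinder(SpiderFootPlugin):
    """Hypothetical module: reports IPv4-looking strings found in web content."""

    def watchedEvents(self):
        return ["TARGET_WEB_CONTENT"]

    def producedEvents(self):
        return ["IP_ADDRESS"]

    def handleEvent(self, event):
        if event.eventType not in self.watchedEvents():
            return
        found = set(re.findall(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", event.data))
        for ipaddr in sorted(found):
            # Passing the received event as the last argument links child to parent.
            e = SpiderFootEvent("IP_ADDRESS", ipaddr, self.__name__, event)
            self.notifyListeners(e)
```

The pattern only checks the dotted shape of an address, not that each number is in range, and
duplicates within one piece of content are reported once.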

Database

All SpiderFoot data is stored in a SQLite database (spiderfoot.db in your SpiderFoot
installation folder) which can be used outside of SpiderFoot for analysis of your data.

The schema is quite simple and can be viewed in the GitHub repo.

The queries below show how to get started:

# Total number of scans in the SpiderFoot database
sqlite> select count(*) from tbl_scan_instance;
10

# Obtain the ID for a particular scan
sqlite> select guid from tbl_scan_instance where seed_target =
'binarypool.com';
b459e339523b8d06235bd06087ae6c6017aaf4ed68dccea0b65a1999a17e460a

# Number of results per data type
sqlite> select count(*), type from tbl_scan_results where
scan_instance_id =
'b459e339523b8d06235bd06087ae6c6017aaf4ed68dccea0b65a1999a17e460a'
group by type;
5|AFFILIATE_INTERNET_NAME
2|AFFILIATE_IPADDR
1|CO_HOSTED_SITE
1|DOMAIN_NAME
1|DOMAIN_REGISTRAR
1|DOMAIN_WHOIS
1|GEOINFO
28|HTTP_CODE
48|HUMAN_NAME
49|INTERNET_NAME
2|IP_ADDRESS
49|LINKED_URL_EXTERNAL
144|LINKED_URL_INTERNAL
2|PROVIDER_DNS
1|PROVIDER_MAIL
4|RAW_DNS_RECORDS
1|RAW_FILE_META_DATA
1|ROOT
14|SEARCH_ENGINE_WEB_CONTENT
1|SOFTWARE_USED
16|TARGET_WEB_CONTENT
2|TCP_PORT_OPEN
1|TCP_PORT_OPEN_BANNER
1|URL_FORM
10|URL_JAVASCRIPT
6|URL_STATIC
21|URL_WEB_FRAMEWORK
28|WEBSERVER_BANNER
28|WEBSERVER_HTTPHEADERS
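
The same queries can be run from a script using Python's built-in sqlite3 module. The sketch
below (Python 3) relies only on the table and column names that appear in the queries above;
the function name results_per_type is made up for this example, and taking the first matching
scan is a simplification, since targets are not unique across scans.

```python
import sqlite3


def results_per_type(conn, seed_target):
    """Return {data type: count} for the first scan found for seed_target."""
    row = conn.execute(
        "SELECT guid FROM tbl_scan_instance WHERE seed_target = ?",
        (seed_target,),
    ).fetchone()
    if row is None:
        return {}
    cursor = conn.execute(
        "SELECT type, COUNT(*) FROM tbl_scan_results "
        "WHERE scan_instance_id = ? GROUP BY type",
        (row[0],),
    )
    return dict(cursor.fetchall())
```

A typical call would be results_per_type(sqlite3.connect("spiderfoot.db"), "binarypool.com"),
run from the SpiderFoot installation folder. Working on a copy of the database is a sensible
precaution while a scan is in progress.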
