During an Open Source Intelligence (OSINT) investigation, having the right tool for the job is essential. The success of an investigation often depends on how effective your tools are, because the amount of information that must be collected, processed, and analyzed is far too large to handle manually.
In this post we show how to install, configure, and use the open-source tool FOCA (Fingerprinting Organizations with Collected Archives) to find metadata and hidden information in documents collected through online search engines such as Google, Bing, and DuckDuckGo.
Open-source intelligence (OSINT)
Open-source intelligence (OSINT) is data collected from publicly available sources to be used in an intelligence context. In the intelligence community, the term “open” refers to overt, publicly available sources. With the advent of instant communications and rapid information transfer, a great deal of actionable and predictive intelligence can now be obtained from public, unclassified sources. OSINT is primarily used in national security, law enforcement, cyber security and business intelligence functions and is of value to analysts who use non-sensitive intelligence in answering classified, unclassified, or proprietary intelligence requirements across the previous intelligence disciplines.
https://en.wikipedia.org/wiki/Open-source_intelligence
Metadata
Metadata is “data that provides information about other data”. In other words, it is “data about data.” Many distinct types of metadata exist, including descriptive metadata, structural metadata, administrative metadata, reference metadata and statistical metadata.
- Descriptive metadata is descriptive information about a resource. It is used for discovery and identification. It includes elements such as title, abstract, author, and keywords.
- Structural metadata is metadata about containers of data and indicates how compound objects are put together, for example, how pages are ordered to form chapters. It describes the types, versions, relationships and other characteristics of digital materials.
- Administrative metadata is information to help manage a resource, like resource type, permissions, and when and how it was created.
- Reference metadata is information about the contents and quality of statistical data.
- Statistical metadata, also called process data, may describe processes that collect, process, or produce statistical data.
https://en.wikipedia.org/wiki/Metadata
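To make descriptive metadata concrete, here is a minimal Python sketch that reads the title, author, and similar fields from an Office Open XML document (.docx, .xlsx, .pptx). It relies on the fact that these files are ZIP packages that normally store such fields in docProps/core.xml. The function name read_core_properties, the helper make_sample_package, and the sample values are hypothetical and exist only for illustration; this is not how FOCA itself is implemented.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside docProps/core.xml of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical sample content, used only to demonstrate the function below.
SAMPLE_CORE_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<cp:coreProperties'
    ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
    ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Quarterly report</dc:title>"
    "<dc:creator>Alice Example</dc:creator>"
    "<cp:lastModifiedBy>bob</cp:lastModifiedBy>"
    "</cp:coreProperties>"
)


def read_core_properties(source):
    """Return descriptive metadata from an OOXML file (path or file-like object)."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    fields = {
        "title": "dc:title",
        "creator": "dc:creator",
        "keywords": "cp:keywords",
        "last_modified_by": "cp:lastModifiedBy",
        "created": "dcterms:created",
    }
    result = {}
    for name, path in fields.items():
        element = root.find(path, NS)
        # Keep only the fields that are present and non-empty.
        if element is not None and element.text:
            result[name] = element.text
    return result


def make_sample_package():
    """Build an in-memory ZIP that mimics the metadata part of a .docx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", SAMPLE_CORE_XML)
    buffer.seek(0)
    return buffer
```

Calling read_core_properties("report.docx") on a real file (the file name is a placeholder) returns a dictionary of whichever fields the document contains. Author and "last modified by" values are exactly the kind of detail that can reveal user names inside an organization.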
Exchangeable image file format (Exif)
Exchangeable image file format (officially Exif, according to JEIDA/JEITA/CIPA specifications) is a standard that specifies the formats for images, sound, and ancillary tags used by digital cameras (including smartphones), scanners and other systems handling image and sound files recorded by digital cameras. The specification uses the following existing file formats with the addition of specific metadata tags: JPEG discrete cosine transform (DCT) for compressed image files, TIFF Rev. 6.0 (RGB or YCbCr) for uncompressed image files, and RIFF WAV for audio files (Linear PCM or ITU-T G.711 μ-Law PCM for uncompressed audio data, and IMA-ADPCM for compressed audio data). It is not used in JPEG 2000 or GIF. This standard consists of the Exif image file specification and the Exif audio file specification.
https://en.wikipedia.org/wiki/Exif
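As a rough illustration of where Exif data lives in a JPEG file, the following Python sketch walks the segments at the start of a JPEG stream and returns the payload of the APP1 segment that begins with the identifier "Exif" followed by two zero bytes. It is a simplified reader under stated assumptions: it stops at the first start-of-scan marker, does not handle every marker type, and does not decode individual tags. The function name find_exif_payload is hypothetical.

```python
import struct


def find_exif_payload(jpeg_bytes):
    """Return the raw Exif payload (TIFF-structured bytes) of a JPEG, or None."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # every JPEG stream starts with the SOI marker
        raise ValueError("not a JPEG stream")
    offset = 2
    while offset + 4 <= len(jpeg_bytes):
        if jpeg_bytes[offset] != 0xFF:
            break  # not at a marker boundary; give up in this simplified reader
        marker = jpeg_bytes[offset + 1]
        if marker in (0xDA, 0xD9):  # start of scan or end of image: no more headers
            break
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[offset + 2:offset + 4])
        if length < 2:
            break  # malformed segment
        payload = jpeg_bytes[offset + 4:offset + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return payload[6:]  # strip the 6-byte identifier
        offset += 2 + length
    return None
```

The returned bytes start with a TIFF header ("II" for little-endian or "MM" for big-endian byte order), which is where a full Exif parser would begin reading tags such as camera model, timestamp, or GPS position.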
FOCA (Fingerprinting Organizations with Collected Archives)
FOCA is a tool used mainly to find metadata and hidden information in the documents it scans. These documents may be published on web pages, and FOCA can download and analyse them. It handles a wide variety of document types, most commonly Microsoft Office, Open Office, and PDF files, but also Adobe InDesign and SVG files, for instance. Documents are located through three search engines: Google, Bing, and DuckDuckGo. Combining the results of all three usually yields a large number of documents. It is also possible to add local files, for example to extract the Exif information from image files, and FOCA analyses the information that can be inferred from each document's URL even before the file is downloaded.
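To give an idea of how documents on a target domain can be located through search engines, here is a small Python sketch that builds one query per file type using the site: and filetype: operators. This is an assumption about a reasonable manual approach, not a description of the queries FOCA sends internally. The function name build_document_queries and the default extension list are hypothetical.

```python
def build_document_queries(domain, extensions=("pdf", "docx", "xlsx", "pptx")):
    """Build one search query per file type, restricted to a single domain."""
    domain = domain.strip().lower()
    if not domain or " " in domain:
        raise ValueError("domain must be a single host name such as example.com")
    # Normalise extensions so ".PDF" and "pdf" produce the same query.
    return [f"site:{domain} filetype:{ext.lstrip('.').lower()}" for ext in extensions]


# Example: print the queries for a placeholder domain.
for query in build_document_queries("example.com"):
    print(query)
```

Each resulting string can be pasted into a search engine by hand. Operator support differs between engines, so results for the same query may vary.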
Requirements
- Microsoft Windows (64-bit), version 7, 8, 8.1, or 10.
- Microsoft .NET Framework 4.7.1.
- Microsoft Visual C++ 2010 x64 or greater.
- An instance of SQL Server 2014 or greater.
https://github.com/ElevenPaths/FOCA/releases
Installation / Run
- Use a Windows 10 Virtual Machine
- Install Microsoft SQL Server Express Edition (https://www.microsoft.com/en-us/sql-server/sql-server-downloads)
- Download the FOCA release archive from the releases page linked above and unzip it.
- Double-click the FOCA.exe file to start the application.


Perform an investigation
