Helping you unlock insights through integrated text processing


Open Source Software


Beagle

Beagle helps you detect keywords, phrases, regex matches, and complex search queries of interest in streams of text documents.

Beagle on GitHub

Accelerated Text

Accelerated Text is a natural language generation tool that takes descriptions of your data and produces multiple versions of those descriptions, varying in wording and text structure.

Accelerated Text on GitHub

Crawling Framework

Easily crawl news portals or blog sites using StormCrawler.

Crawling Framework on GitHub


Areas of Expertise

Targeted Crawling

Example: crawl 10,000,000 web pages per day and make them available for enterprise search.

Website Parsing

Example: given a list of investment fund websites, determine the geographic makeup of their exposure.

Search and Discovery

Example: index 500,000 quarterly reports, then determine which factors matter for ranking in the top 10 for each query of interest.

Media Parsing

Example: identify market reactions to commodity price fluctuations as reflected in popular media.

Corporate Document Parsing

Example: retrieve auditor details from a repository of quarterly company reports.

Report Generation (Natural Language)

Example: automatically generate monthly employee performance reports for different stakeholders.

A Few of Our Past Projects

Venture Radar (UK)

NLP pipeline with crawler and venture capital funding event detection.

Weborama (FR)

NLP library used as part of Weborama's media monitoring package.

SaasMAX (US)

Custom company web page crawl to extract information about business activities.

ROI: Recruit (SE)

NLP pipeline with crawler, job advertisement identification, and contact person recognition.

Orbit Financial Technology (UK)

NLP pipeline with crawler. Detection of events related to financial instruments. Time series database population.

Kaunas University of Technology (LT)

Crawler, named entity recognition, text classification, clustering, deduplication, text similarity estimation, and sentiment analysis.

Social Artisan (UK)

NLP pipeline with web and social media crawler, named entity recognition, sentiment analysis, and article classification.

Agency for Science, Innovation and Technology (LT)

Open source word stemmer and page function identification algorithm. Research into classification of customer care messages.


Our Team


Zygimantas Medelis


Sarunas Navickas
Data Engineer


Rokas Ramanauskas
Software Engineer

Contact Us