2 | Scrap the Web Scraping: The Guide to Automating Web Data Extraction
Introduction
The internet is big. Really big.
The surface web (those pages indexed by search engines and accessible to the public) contains
at least 4.5 billion pages.1
Then there’s the deep web, or invisible web—pages that aren’t indexed by search engines,
such as content accessed via forms or password credentials, and pages that aren’t linked or
registered with search engines. No one knows how big the invisible web really is, but some
experts estimate that it’s a whopping 500 times the size of the surface web.
When your business depends on collecting and interpreting web data, the glut of available
information is both the challenge and the opportunity. You can’t afford to manually sift through
even a tiny fraction of those billions of web pages and portals, logging in and out or hoping
your homegrown code or web scraper won’t break on a dynamic web site.
But what if you could find a way to efficiently and automatically collect accurate, comprehensive
web research that’s delivered in near-real time to power your most critical business decisions?
Automated Web Data Extraction is Vital for World-Class Businesses
Global businesses are in a continual race to evolve. This is especially true for companies that are in the business
of information—whether you gather and analyze information to advise or sell to others, use information in-house
to maintain a competitive edge, or leverage information in an entirely new and disruptive business model.
Yet many companies rely on manual research, incomplete information gathered from web scraping tools,
third-party data and home-grown coding solutions to power this critical piece of their business: the equivalent of
pedaling a bike down the online superhighway while others are whizzing past in race cars.
“The goal is to turn data into information, and information into insight.”
– Carly Fiorina, former CEO of Hewlett-Packard
The Rise of Swivel Chair Automation
Complete automation of web data collection has been out of reach
for many organizations due to limited technology options. Manual
data collection and cleanup is a tremendous burden that requires
human workers to act as the conduit between several systems,
moving between websites, portals and applications to key, re-key,
copy and paste information.
8 Drawbacks of Swivel Chair Automation
TRADITIONAL OPTIONS, INCOMPLETE RESULTS
Closing the Web Data Extraction Gap: 3 Traditional Options
Let’s look at the three options traditionally used by organizations that rely on web data to power their businesses:
The Problem with Tradition
Tradition, a.k.a. “the way we’ve always done it,” is a wonderful way to celebrate special holidays and customs,
but this approach often falls short of delivering an ideal outcome in the rapidly-changing business world.
THE MISSING PIECE: WEB DATA EXTRACTION
What is Web Data Extraction?
Web data extraction is one use case of Kofax Kapow™ Robotic Process Automation (RPA), which uses
intelligent software robots to automate the collection of vast amounts of data like market intelligence,
financial data, news information, public records and court documents, competitive pricing and many other
diverse sources of data on the public web and online web portals.
Not only can you deploy robots in a matter of days or weeks, not months, with no coding required, but
those robots will immediately deliver accurate, timely, high-value content from the web with near-real-time
monitoring of large volumes of information and precise web data extraction.
In short, web data extraction solves problems that were previously unsolvable.
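For readers who want a concrete picture of what "extracting web data" means at its simplest, the following is a minimal Python sketch using only the standard library. It is not how Kofax Kapow works (the platform is described above as requiring no coding), and everything in it is an assumption made for illustration: the HTML fragment, the product names, the `product` and `price` class names, and the `PriceTableParser` and `extract_prices` helpers are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page fragment (illustrative only); a real job would fetch
# this from a website or portal instead of hard-coding it.
SAMPLE_HTML = """
<table id="prices">
  <tr><td class="product">Widget A</td><td class="price">$19.99</td></tr>
  <tr><td class="product">Widget B</td><td class="price">$5.00</td></tr>
</table>
"""


class PriceTableParser(HTMLParser):
    """Collects product/price pairs from <td class="product|price"> cells."""

    def __init__(self):
        super().__init__()
        self.records = []    # finished rows
        self._row = {}       # row currently being built
        self._field = None   # name of the cell we are inside, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td":
            cls = dict(attrs).get("class")
            self._field = cls if cls in ("product", "price") else None

    def handle_data(self, data):
        # Text outside a recognized cell (e.g. whitespace) is ignored.
        if self._field:
            self._row[self._field] = self._row.get(self._field, "") + data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and "product" in self._row and "price" in self._row:
            # Transformation step: turn the "$19.99" string into a number.
            price = float(self._row["price"].lstrip("$"))
            self.records.append({"product": self._row["product"], "price": price})


def extract_prices(html):
    """Return a list of {"product": str, "price": float} records."""
    parser = PriceTableParser()
    parser.feed(html)
    return parser.records


if __name__ == "__main__":
    for record in extract_prices(SAMPLE_HTML):
        print(record)
```

Note that the sketch depends on hard-coded tag and class names, so it would stop working as soon as the page layout changed. That fragility is the weakness of home-grown scraping code that this guide points to.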
Versatility Meets Utility: How Web Data Extraction is Used
Web Data Extraction replaces traditional methods of data collection and is ideal for:
Research and consultancy firms that extract and analyze large amounts of information from the web
Examples: Investment research; government research
REAL WORLD RESULTS: WEB DATA EXTRACTION SUCCESS STORIES
Global Research and Consultancy Company Frees Analysts from Time-Consuming Manual Data Collection
The Solution
The company deployed Kofax Kapow to automate the collection and integration of data, freeing analysts to
focus on data analysis and research work.
Bonus outcome:
Exponentially less time spent on data collection frees analysts to provide higher-quality services to clients
and make smarter strategic decisions with profitable results.
Global Financial Services Company Delivers Faster, Sharper Investment Insights to Clients
The Solution
The company deployed Kofax Kapow to automate and streamline the extraction, transformation and
delivery of web-based content in both structured and unstructured formats.
Bonus outcome:
Timely, holistic insights empower clients to make the most of new opportunities and run more competitive
and profitable businesses.
Farner Consulting AG Creates the First Fully Automated Solution for Monitoring Political Issues
Spotcap Revolutionizes Lending with Flexible Financing for Small and Medium-Sized Businesses
Spotcap knew if they could make the loan process incredibly fast and efficient,
they could revolutionize the SMB lending landscape. Spotcap uses Kapow
robots and supporting application program interfaces (APIs) to automatically
extract thousands of data points from a wealth of sources, including customers’
accounting software, company registers, tax authority records, credit databases,
e-commerce websites and more. Kapow then transforms and integrates the data
so it can be used by Spotcap’s credit assessment algorithm.
Bonus outcome:
By making it easier for smaller companies to access the financing they
need, Spotcap empowers small and medium businesses, the
“backbone of the economy,” to take their companies to new
heights and strengthen the economy as a whole.
Is Web Data Extraction Right for You?
Organizations that benefit most from the automation of web data
collection and integration:
The View from Leading Analysts
Forrester
“Kofax acquired Kapow in 2013 for its data integration smarts but soon found the real jewels: a robotic
engine that drives web APIs for use cases that must gather and process data from internal and external sites.”
Celent
“Kofax Kapow™ is particularly strong in deploying web bots, serves multiple industries such as banking and
finance, logistics manufacturing, healthcare, and retail and travel, and has developed solutions for compliance
monitoring and reporting for banks’ KYC-AML operations.”
— Innovation in Compliance Technology: Emerging Themes and Vendor Solutions, July 2017
Aragon Research
“Kapow robots can be implemented without complex coding or lengthy development cycles, which
dramatically expedites project deployments and speeds ROI. Instead of requiring a costly virtual desktop
infrastructure (VDI) like other RPA vendors, Kapow minimizes its VDI footprint by providing a centralized server
model that allows web and mainframe robots to execute without ever connecting to a virtual desktop.”
How Automated Web Data Extraction Transforms Your Business
Before After
Four Must-Haves for Your RPA Solution
If you’ve decided to investigate robotic process automation for automating web data extraction,
consider a solution that:
Can extract and deliver data from multiple sources, including websites, web portals and web
apps, as well as internal and external applications
Automates all aspects of the data extraction and integration process, including real-time
changes on complex dynamic websites
Can securely scale without the need for complex and costly virtual desktops and browsers
“Service providers that leverage automation in their services portfolio have shown
that they can increase value to their existing customers and differentiate themselves
to new customers in a crowded marketplace. When providers expedite manually
intensive processes, they are able to broaden their offerings and grow their client base.”2
– Institute for Robotic Process Automation & Artificial Intelligence
Additional Resources
Learn more about how smart software robots can automate and scale the acquisition, transformation and delivery of
web data in your organization:
Video: Gain a Competitive Advantage with Data Integration and Robotic Process Automation
White Paper: Ten Must-Haves for Web Data Extraction and Transformation
White Paper: Integrating Data Sources is an Expensive Challenge for the Financial Services Sector
For more information, ask for a demo of
Kapow Robotic Process Automation from Kofax.
POWER YOUR PROCESSES. EMPOWER YOUR CUSTOMERS.
Power web data extraction with the Kofax Kapow Robotic Process Automation Platform. For more information, contact us
at info@kofax.com or give us a call at +1 949.783.1333.
kofax.com/rpa
© 2018 Kofax. Kofax and the Kofax logo are trademarks of Kofax, registered in the United States and/or other countries. All other trademarks are the property of their respective owners.