
15/08/2019 Using Python to Scrape the Meet-Up API - DEV Community 👩💻👨💻


Using Python to Scrape the Meet-Up API
SeattleDataGuy Aug 10 Updated on Aug 12, 2019 ‧5 min read
#python #sql

We recently posted some ideas for projects you could take on, to add
to your resume and help you learn more about programming.
One of those projects involved scraping the Meet-up and Eventbrite
APIs to create an aggregate site of events.
This is a great project, and it opens up the opportunity to explore
several concepts. You could use this idea to build an alerting system
in which users enter their API keys to track local events they are
interested in. You could also develop a site that predicts which live
acts will become highly popular, before they get there, by tracking
metrics over time.


https://dev.to/seattledataguy/using-python-to-scrape-the-meet-up-api-316p 1/9
Honestly, the APIs give a decent amount of data, even to the point of
giving you member names (and supposedly emails too, if the member
is authenticated). It’s a lot of fun, and you can use this data as the
basis of your own site!

To Start:

To start this project, break down the basic pieces you will need to build
the backend. More than likely you will need:

An API “Scraper”
Database interface
Operational database
Data-warehouse (optional)
ORM

To start out you will need to develop a scraper class. This class should
be agnostic of the specific API call you’re making. That way, you can
avoid having to make a specific class or script for each call. In
addition, when the API changes, you won’t have to spend as much
time going through every script to update every variable.

Instead, you’ll only need to go through and update the configurations.


That being said, we don’t recommend trying to develop a perfectly
abstract class right away. Trying to build a perfectly abstracted class
that has no hard-coded variables, from the beginning, can be difficult.
If anything goes wrong or doesn’t work then it is harder to debug
because of the layers of abstraction.


We’ll start by trying to develop pieces that work.

The first decision you need to make is where the scraper will be
putting the data. We’re creating a folder structure in which each day
has its own folder.
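
As a rough sketch of that layout, the day-per-folder idea could look something like this. The base directory, the ISO date used as the folder name, and the helper names `daily_folder` and `write_raw` are all our assumptions for illustration, not part of the original script.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def daily_folder(base_dir: Path, day: Optional[date] = None) -> Path:
    """Return (and create if needed) base_dir/YYYY-MM-DD for the given day."""
    day = day or date.today()
    folder = base_dir / day.isoformat()  # assumed naming scheme, e.g. 2019-08-10
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def write_raw(records: list, base_dir: Path, name: str,
              day: Optional[date] = None) -> Path:
    """Dump raw API records as JSON inside that day's folder."""
    path = daily_folder(base_dir, day) / f"{name}.json"
    with path.open("w", encoding="utf-8") as handle:
        json.dump(records, handle)
    return path
```

Calling `write_raw(results, Path("data"), "get_event")` would then place today's pull under `data/<today's date>/get_event.json`, so each day's raw files stay separate and are easy to reprocess.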

You could use a general folder on a server, S3, or a similar raw file
structure. These offer the ability to easily store the raw data that we’re
storing in a JSON file. Other data storage methods, like csv and tsv,
are thrown off by the way the description data is formatted.
Let’s take a look at the basic script. As you read it, think about how
you could start configuring and refactoring the codebase to improve it.

import requests
import time
import json
import sys
import codecs
import csv

class MeetUpScraper:

    api_call_type=""
    config_file="meet_up_config.json"

    def get_results(self,params,config_data):
        request=requests.get(config_data[self.api_call_type]['api_endpoi
        data=request.json()
        return data

    def main(self,p_config_file):
        cities=[("Seattle","WA")]
        api_key="APIKEY"
        for (city,state) in cities:
            per_page=200
            results_we_got = per_page
            offset=0
            while(results_we_got==per_page):

                response = self.get_results(
                    {"sign":"true","country":"US","city":city,"state":state,
                    ,p_config_file
                )
                time.sleep(1)
                offset+=1
                data={}
                results_we_got = response['meta']['count']
                data = response['results']
                export_file= open("data/data_"+self.api_call_type+"_"+st
                json.dump(data,export_file)
                export_file.close()

    def __init__(self,api_call_type):
        self.api_call_type=api_call_type
        config=open(self.config_file)
        config_data=json.load(config)
        self.main(config_data)

MeetUpScraper("get_event") #for testing
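
Because several lines of the listing above are cut off, here is a small, self-contained sketch of just its paging loop. It is a hedged variant rather than the original script. The `get_page` callable, the function names, and the output file naming are assumptions. It relies only on the `meta.count` and `results` keys that the listing reads from each response.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict, Iterator, List, Tuple


def iter_pages(get_page: Callable[[int], Dict], per_page: int,
               pause_seconds: float = 1.0) -> Iterator[Tuple[int, List]]:
    """Yield (offset, results) for each page until a short page comes back."""
    offset = 0
    while True:
        response = get_page(offset)
        yield offset, response["results"]
        # A page holding fewer results than requested is the last one.
        if response["meta"]["count"] < per_page:
            break
        offset += 1
        time.sleep(pause_seconds)  # pause between calls, as the script does


def save_pages(pages, out_dir: Path, api_call_type: str) -> List[Path]:
    """Write each page of results to its own JSON file and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for offset, results in pages:
        # Assumed naming scheme: one file per call type and page offset.
        path = out_dir / f"data_{api_call_type}_{offset}.json"
        with path.open("w", encoding="utf-8") as export_file:
            json.dump(results, export_file)
        written.append(path)
    return written
```

In the real scraper, `get_page` would wrap the `requests.get(...)` call with the city parameters plus whatever paging parameters the API expects. Keeping the HTTP call behind a callable also means the loop can be tested with canned responses instead of live requests.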


One place to start refactoring, right off the bat, is the API key.
While you’re testing, it’s easy to hard-code your own API key. But if
your eventual goal is to allow multiple users to gain access to this
data, then you will need a way to set up each user’s own API key.
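
One minimal way to get the key out of the source code is to read it from the environment. This is only a sketch: the variable name `MEETUP_API_KEY` and the function `load_api_key` are made up for illustration, and a multi-user site would more likely read each key from a per-user settings store.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 var_name: str = "MEETUP_API_KEY") -> str:
    """Read the API key from the environment instead of hard-coding it."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        # Fail early with a clear message rather than sending an empty key.
        raise RuntimeError(f"Set {var_name} before running the scraper")
    return key
```

Passing a plain dictionary as `env` makes the function easy to test, and the same signature could later be pointed at a lookup keyed by user.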

The next portion you will want to update is the hard-coded references
to the data you are pulling. This hard-coding limits the code to
working with only one API call. One example is how we pick the
endpoint and reference which fields we would like to pull from what
is returned.
For this example, we are just dumping everything in JSON. Perhaps
you want to be very choosy. In that case, you might want to
configure which columns are attached to each field.

For example:

{
    "get_group":
    {
        "api_endpoint":"http://api.meetup.com/2/groups"
    },
    "get_event":
    {
        "row_list":
            ["country", "city", "created", "rating", "description", "rating"
        "insert_script":
            "INSERT into raw_meet_up_3 (country, city, created, rating, descri
        "api_endpoint":"http://api.meetup.com/2/open_events"
    }
}

This allows you to create a scraper that is agnostic of which API call
you will be using. It puts the settings outside the code, which can be
easier to maintain.

For example, what happens if Meet-up changes the API endpoints or
column names? Well, instead of having to go into 10 different code
files, you can just change the config file.
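
To make that concrete, here is a hedged sketch of how code might consume such a config. The config text below is a shortened, hypothetical stand-in in the same shape as the example, and `select_fields` and `build_insert` are illustrative names. Note that `build_insert` derives the statement from `row_list`, which is an alternative to storing a full `insert_script` in the config. The `?` placeholders assume a driver that uses that style, such as `sqlite3`.

```python
import json

# Hypothetical config in the same shape as the example; the field list is
# shortened for illustration and is not the article's full list.
CONFIG_TEXT = """
{
  "get_event": {
    "row_list": ["country", "city", "created", "description"],
    "api_endpoint": "http://api.meetup.com/2/open_events"
  }
}
"""


def select_fields(record: dict, row_list: list) -> dict:
    """Keep only the configured fields; fields missing from the record become None."""
    return {field: record.get(field) for field in row_list}


def build_insert(table: str, row_list: list) -> str:
    """Build a parameterized INSERT with one placeholder per configured field."""
    columns = ", ".join(row_list)
    placeholders = ", ".join("?" for _ in row_list)
    # Table and column names come from our own config file, never from user input.
    return f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"


config = json.loads(CONFIG_TEXT)["get_event"]
row = select_fields({"city": "Seattle", "country": "us", "extra": 1},
                    config["row_list"])
```

If a column is renamed upstream, only `row_list` in the config changes, and both the field selection and the insert statement follow.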

The next stage is creating a database and ETL, to load and store all the
data, and a system that automatically parses the data from the JSON
files into an operational style database. This database can be used to
help track events that you might be interested in. In addition, creating a
data warehouse could help track metrics.
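
As a very small sketch of that load step, the following reads the raw JSON files from a folder and inserts them into a SQLite table. SQLite, the table name `raw_events`, and the column subset are all assumptions chosen to keep the example self-contained. A production setup would likely target a different database and add typing, deduplication, and error handling.

```python
import json
import sqlite3
from pathlib import Path

COLUMNS = ["country", "city", "created", "description"]  # assumed subset of fields


def load_json_files(conn: sqlite3.Connection, folder: Path,
                    table: str = "raw_events") -> int:
    """Insert every record found in folder/*.json and return the row count."""
    column_sql = ", ".join(COLUMNS)
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({column_sql})")
    loaded = 0
    for path in sorted(folder.glob("*.json")):
        records = json.loads(path.read_text(encoding="utf-8"))
        # Missing fields are stored as NULL rather than failing the whole file.
        rows = [tuple(record.get(col) for col in COLUMNS) for record in records]
        conn.executemany(
            f"INSERT INTO {table} ({column_sql}) VALUES ({placeholders})", rows
        )
        loaded += len(rows)
    conn.commit()
    return loaded
```

Running this once per daily folder would give a simple operational table to query, and a warehouse layer could later be built on top of it.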


Perhaps you’re interested in the rate at which people RSVP to events,
or how quickly events get sold out.

Based on that, you could analyze which types of descriptions or groups
quickly run out of slots.
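
One simple version of such a metric, sketched under the assumption that the scraper runs on a schedule and records each event’s RSVP count every time it runs. The snapshot shape and the function name are made up for illustration.

```python
from datetime import datetime
from typing import List, Tuple

# (time the event was scraped, RSVP count seen at that time)
Snapshot = Tuple[datetime, int]


def rsvps_per_day(snapshots: List[Snapshot]) -> float:
    """Average RSVPs gained per day between the first and last snapshot."""
    ordered = sorted(snapshots)
    (first_time, first_count), (last_time, last_count) = ordered[0], ordered[-1]
    elapsed_days = (last_time - first_time).total_seconds() / 86400
    if elapsed_days <= 0:
        raise ValueError("need at least two snapshots taken at different times")
    return (last_count - first_count) / elapsed_days
```

Comparing this rate across groups or description styles is one way to start looking for events that fill up fast.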

Personally, we think there is a lot of fun analysis you could take on.


Over the next few weeks and months, we’ll be working to continue
developing this project. This includes building a database, maybe
doing some analysis, and more!

We hope you enjoyed this piece!

If you enjoyed this video about software engineering then consider
these videos as well!
The Advantages Healthcare Providers Have In Healthcare Analytics
142 Resources for Mastering Coding Interviews
Learning Data Science: Our Top 25 Data Science Courses
The Best And Only Python Tutorial You Will Ever Need To Watch
Dynamically Bulk Inserting CSV Data Into A SQL Server
4 Must Have Skills For Data Scientists
What Is A Data Scientist

SeattleDataGuy
Data Engineer | Consultant | Data Scientist
@seattledataguy SeattleDataGuy www.theseattledataguy.com
