There was no slowing down this week. Our project was predominantly Alteryx-based, a
software we had never used before, so this was a real uphill battle! The theme was Education and
the task was to find Education data on the Internet, use Alteryx to prepare and collate it, then use
Tableau to create a Viz.
Fortunately the almighty Alteryx Ace Chris Love provided us with the ammunition to combat the
challenge by teaching us about connecting to data, dataset manipulation, data parsing, data
blending, web scraping and where to look for help. Always impressed by experts in a subject
who are able to pass on their knowledge so well!
As usual, Andy had lined up another prominent figure to speak at the Data School; yes, we are
becoming a spoiled bunch! This week Tableau CMO Elissa Fink talked to us about how Tableau
is marketed. Elissa was very informative, encouraging and simply magnificent. The more I hear
about how Tableau operates, the more I fall in love with it. Tableau's success has been built on
putting us customers and the community first and at the heart of everything they do. And no
doubt, they use Tableau themselves in the Marketing team too!
During the week we were also fortunate to have Laszlo Zsom talk us through the differences in
table joins between Alteryx, Tableau and SQL as well as consultants Mike Lowe and Robin
Kennedy who further prepared us for the Tableau Desktop Qualified Associate exam by
delivering some more Tableau training.
Back to Alteryx. My aim for this project was to drive more insight from The Complete University
Guide's university ranking tables. I wanted to import all 9 years' worth of league tables by the 67
subjects along with the overall league table for every year, meaning I needed to import 9 x 68 =
612 league tables. Now imagine going to each and every one of those 612 web pages, copying
and pasting the tables, removing column headers, making sure all the data was in the right
format and so on. The whole process would be boring and repetitive and would've taken days if
not weeks to complete! Luckily, I was able to do this in no time once I figured out how to use
Alteryx.
o http://www.thecompleteuniversityguide.co.uk/league-tables/rankings?
o http://www.thecompleteuniversityguide.co.uk/league-tables/rankings?y=2015
o http://www.thecompleteuniversityguide.co.uk/league-tables/rankings?s=Accounting+%26+Finance&y=2015
My first task therefore was to generate all 612 URLs using the first part of the URL, which they
all had in common, and XML elements for all 67 subjects. I got the XML elements by
right-clicking on the webpage that displayed all the hyperlinks to the subject league tables and
clicking on "view source code".
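For anyone who wants to see the idea outside Alteryx, here is a minimal Python sketch of the same URL generation. It is not the Alteryx workflow itself. The year range and the placeholder subject names are assumptions for illustration only; the real 67 subject names came from the page source.

```python
from urllib.parse import urlencode

# Common first part of every league table URL (taken from the examples above).
BASE_URL = "http://www.thecompleteuniversityguide.co.uk/league-tables/rankings?"


def build_league_table_urls(subjects, years):
    """Return one URL per year for the overall table plus one per subject."""
    urls = []
    for year in years:
        # Overall league table: only the year parameter is set.
        urls.append(BASE_URL + urlencode({"y": year}))
        for subject in subjects:
            # urlencode turns spaces into "+" and "&" into "%26".
            urls.append(BASE_URL + urlencode({"s": subject, "y": year}))
    return urls


# Assumption: nine consecutive years; the actual years are not stated in the post.
years = range(2008, 2017)
# Assumption: placeholder names stand in for the 66 subjects not listed in the post.
subjects = ["Accounting & Finance"] + [f"Subject {i}" for i in range(2, 68)]

urls = build_league_table_urls(subjects, years)
print(len(urls))  # 9 years x (67 subjects + 1 overall table) = 612
```

Each generated URL can then be fed to whatever does the downloading, which in my case was Alteryx.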
The rest was straightforward. I used Tableau to create my Viz. Click here to interact with it on
Tableau Public. It allows you to see the number of universities you specify ranked and on a map
by year, subject and one of four indicators, explained here on their website.
Quite astonished I was able to do this! Previously thought you had to be a hardcore programmer
to do things like web scraping! Excited by the possibilities of using Alteryx in combination with
Tableau, especially since we accomplished so much in so little time using Alteryx!
Behavior Detail Fields
Configuration Properties
1 Choose a Dataset: Select the Dataset to use. Each dataset has its own Profiles and Profile
Sets that are specific to the selected clustering system. These datasets require a current
subscription and license. Please contact your Alteryx account representative for more
information regarding compatible datasets.
o Note: For best results, keep your datasets consistent with each Behavior tool.
Choosing "Most Recent Vintage" rather than a specific dataset ensures the most
current dataset is used and won't require updating your workflow. You can easily
specify the dataset in multiple tools at once through Workflow Dependencies to
ensure consistency throughout your workflow.
o Note: You can specify the default dataset from User Settings. Go to Tools --> User
Settings and click on the Dataset Defaults tab.
2 Analyze: Select the data field that represents the Profile to use for analysis.
3 Using (Optional): Select the data field that represents the Profile to use to compare the
Analyze Profile against for the fields selected below.
4 Show By: Record level of the data. Choices include Cluster or Group. Clusters are rolled
into Groups.
5 Demographic: Classification level the Profile was built with. Choices include Auto-Detect,
Household, Person, or Adult, but will vary by vendor.
6 Select Output Fields: Use the checkboxes to select the desired output fields. The buttons
All and Clear can help with multiple selections.
o Use Long Names: When checked, field names are returned as they appear below.
When left unchecked, shorter, less descriptive field names are used. The table
below displays available fields for output.
Long Name                                 Short Name
Analyze - Average Volumetric Value        A_AVG_VOL
Analyze - Base Count                      A_BASE
Analyze - Penetration                     A_PEN
Analyze - Total Volumetric Value          A_VOL
Analyze Scaled to Country - Base Count    A_BASE_US
Analyze Scaled to Country - Count         A_USERS_US
Cluster - Description                     DESC
Cluster - Global Mosaic Group             UG
Cluster - Mosaic Group                    MF
Cluster - Penetration                     CLUSTER_PER
Cluster - Volume Index                    VOL_PEN_IDX
Cluster - Volume Penetration              VOL_PEN
Demographic                               Demographic
Market Potential - % Count                MKPOT_PER
Market Potential - % Volumetric           MKPOT_VOL_PER
Market Potential - Count                  MKPOT
Market Potential - Volumetric             MKPOT_VOL
Using - % Base Count                      U_BASE_PER
Using - Average Volumetric Value          U_AVG_VOL
Using - Total Volumetric Value            U_VOL
Using Scaled to Country - Base Count      U_BASE_US
Using Scaled to Country - Count           U_USERS_US
Click Apply to have the configurations accepted.
Note: For information regarding Input, Output, Annotation and Error Properties, see Tool
Properties.
Alteryx, Inc. * 230 Commerce * Suite 250 * Irvine, CA * 96202-1338 * Telephone: (888) 255-1207 * www.alteryx.com
The two inputs represent the actual and expected values in your data. These data streams are
passed through a Record ID tool to keep positional integrity and then passed on to the Transpose
tool to create two columns. The first column contains the field names and the second column
shows the values within each field. This data is then passed on to a join, matching on Record ID
and the Name of the field, in order to compare each value. Lastly, if the data does not match from
expected to actual, a custom message will appear in the results messages alerting the user where
the mismatch happened within the dataset. The image below shows the error message produced
if values differ across datasets.
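The comparison logic described above can be sketched in plain Python. This is an illustration of the technique under stated assumptions, not the macro itself: the function names, the message wording, and the sample rows are all hypothetical.

```python
def transpose(records):
    """Give each row a 1-based record ID and unpivot it into
    {(record_id, field_name): value}, mirroring Record ID + Transpose."""
    cells = {}
    for record_id, row in enumerate(records, start=1):
        for name, value in row.items():
            cells[(record_id, name)] = value
    return cells


def find_mismatches(expected, actual):
    """Join both streams on (record ID, field name) and report differing values."""
    exp = transpose(expected)
    act = transpose(actual)
    messages = []
    # Inner join: keys present on only one side are ignored in this sketch.
    for record_id, name in sorted(exp.keys() & act.keys()):
        exp_value = exp[(record_id, name)]
        act_value = act[(record_id, name)]
        if exp_value != act_value:
            messages.append(
                f"Record {record_id}, field '{name}': "
                f"expected {exp_value!r}, got {act_value!r}"
            )
    return messages


# Hypothetical sample data: the second record differs in the "sales" field.
expected = [{"id": "A", "sales": 10}, {"id": "B", "sales": 20}]
actual = [{"id": "A", "sales": 10}, {"id": "B", "sales": 25}]

messages = find_mismatches(expected, actual)
for message in messages:
    print(message)
```

Keying each cell on both the record ID and the field name is what lets the message point to the exact row and column where the two datasets diverge.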