Introducing DeepText: Facebook's text understanding engine | Engineering Blog | Facebook Code
2 June
INFRA, DATA, RESEARCH, NEWS FEED
Aparna Lakshmiratan
Joy Zhang
Text is a prevalent form of communication on Facebook. Understanding the various ways text is
used on Facebook can help us improve people's experiences with our products, whether we're
surfacing more of the content that people want to see or filtering out undesirable content like
spam.
With this goal in mind, we built DeepText, a deep learning-based text understanding engine that
can understand, with near-human accuracy, the textual content of several thousand posts per
second, spanning more than 20 languages.
DeepText leverages several deep neural network architectures, including convolutional and
recurrent neural nets, and can perform both word-level and character-level learning. We use
FBLearner Flow and Torch for model training. Trained models are served with the click of a
button through the FBLearner Predictor platform, which provides a scalable and reliable model
distribution infrastructure. Facebook engineers can easily build new DeepText models through
the self-serve architecture that DeepText provides.
post is about (basketball, for example) and recognition of entities, like the names of players,
stats from a game, and other meaningful information. But to get closer to how humans
understand text, we need to teach the computer to understand things like slang and word-sense
disambiguation. As an example, if someone says, "I like blackberry," does that mean the fruit or
the device?
Text understanding on Facebook requires solving tricky scaling and language challenges where
traditional NLP techniques are not effective. Using deep learning, we are able to understand
text better across multiple languages and use labeled data much more efficiently than
traditional NLP techniques. DeepText has built on and extended ideas in deep learning that
were originally developed in papers by Ronan Collobert and Yann LeCun from Facebook AI
Research.
Deeper understanding
In traditional NLP approaches, words are converted into a format that a computer algorithm can
learn. The word "brother" might be assigned an integer ID such as 4598, while the word "bro"
becomes another integer, like 986665. This representation requires each word to be seen with
exact spellings in the training data to be understood.
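To make the limitation concrete, here is a minimal sketch of an integer-ID vocabulary. The two IDs echo the example above; the `UNK_ID` fallback, the `encode` helper, and the misspelled token are assumptions added for illustration, not part of DeepText.

```python
# Toy vocabulary: the two IDs mirror the example in the text above.
vocab = {"brother": 4598, "bro": 986665}
UNK_ID = 0  # assumed reserved ID for any word not seen during training


def encode(tokens, vocab, unk_id=UNK_ID):
    """Map each token to its integer ID, falling back to unk_id when unseen."""
    return [vocab.get(token, unk_id) for token in tokens]


ids = encode(["brother", "bro", "brotha"], vocab)
print(ids)  # [4598, 986665, 0]
```

The IDs 4598 and 986665 say nothing about how related "brother" and "bro" are, and the unseen spelling "brotha" collapses into the same bucket as every other unknown word.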
With deep learning, we can instead use word embeddings, a mathematical concept that
preserves the semantic relationship among words. So, when calculated properly, we can see
that the word embeddings of "brother" and "bro" are close in space. This type of representation
allows us to capture the deeper semantic meaning of words.
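One common way to measure "close in space" is cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings are learned from data and have far more dimensions, and the post does not say which distance measure DeepText uses.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical embeddings, hand-picked so related words point the same way.
embeddings = {
    "brother": [0.90, 0.10, 0.30],
    "bro": [0.85, 0.15, 0.35],
    "banana": [0.05, 0.95, 0.10],
}

sim_close = cosine_similarity(embeddings["brother"], embeddings["bro"])
sim_far = cosine_similarity(embeddings["brother"], embeddings["banana"])
print(sim_close > sim_far)  # True
```

With these toy vectors, "brother" and "bro" score close to 1, while "brother" and "banana" score much lower.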
Using word embeddings, we can also understand the same semantics across multiple
languages, despite differences in the surface form. As an example, for English and Spanish,
"happy birthday" and "feliz cumpleaños" should be very close to each other in the common
embedding space. By mapping words and phrases into a common embedding space, DeepText
is capable of building models that are language-agnostic.
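As a rough sketch of what a common embedding space allows, the example below embeds a phrase by averaging its word vectors and retrieves the nearest phrase in another language. The vectors, the averaging scheme, and the helper names are all assumptions for illustration; the post does not describe how DeepText builds its shared space.

```python
import math

# Hypothetical shared English/Spanish embedding space (toy 3-d vectors).
SHARED_SPACE = {
    "happy": [0.90, 0.10, 0.00],
    "birthday": [0.10, 0.90, 0.00],
    "feliz": [0.88, 0.12, 0.02],
    "cumpleaños": [0.12, 0.88, 0.01],
    "buenas": [0.10, 0.00, 0.90],
    "noches": [0.00, 0.20, 0.95],
}


def embed_phrase(phrase):
    """Average the word vectors of a whitespace-tokenized phrase."""
    vectors = [SHARED_SPACE[word] for word in phrase.split()]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def nearest(query, candidates):
    """Return the candidate phrase whose embedding is closest to the query's."""
    query_vec = embed_phrase(query)
    return max(candidates, key=lambda c: cosine(query_vec, embed_phrase(c)))


match = nearest("happy birthday", ["feliz cumpleaños", "buenas noches"])
print(match)  # feliz cumpleaños
```

Because both languages live in the same space, a model trained on vectors from one language can, in principle, be applied to vectors from the other.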
Written language, despite the variations mentioned above, has a lot of structure that can be
extracted from unlabeled text using unsupervised learning and captured in embeddings. Deep
learning offers a good framework to leverage these embeddings and refine them further using
small labeled data sets. This is a significant advantage over traditional methods, which often
require large amounts of human-labeled data that are inefficient to generate and difficult to
adapt to new tasks. In many cases, this combination of unsupervised learning and supervised
learning significantly improves performance, as it compensates for the scarcity of labeled data
sets.
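The idea can be sketched with a tiny classifier trained on top of pre-trained embeddings. Everything here is an assumption for illustration: the vectors stand in for embeddings learned from unlabeled text, the labels and helper names are invented, and for simplicity the embeddings are kept frozen rather than refined as described above.

```python
import math

# Assumed output of unsupervised pre-training on unlabeled text (toy 2-d vectors).
PRETRAINED = {
    "sell": [1.0, 0.0], "selling": [0.9, 0.1], "price": [0.8, 0.2],
    "game": [0.0, 1.0], "score": [0.1, 0.9], "team": [0.2, 0.8],
}


def featurize(text):
    """Average the pre-trained vectors of the known words in text."""
    vectors = [PRETRAINED[w] for w in text.split() if w in PRETRAINED]
    if not vectors:
        return [0.0, 0.0]
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(2)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.5):
    """Logistic regression fitted with plain stochastic gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = featurize(text)
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(text, weights, bias):
    x = featurize(text)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Only two labeled posts: 1 = about selling, 0 = about sports.
labeled = [("sell price", 1), ("game score", 0)]
weights, bias = train(labeled)

p_sale = predict("selling", weights, bias)  # word absent from the labeled set
p_sport = predict("team", weights, bias)    # word absent from the labeled set
print(p_sale > 0.5, p_sport < 0.5)  # True True
```

The words "selling" and "team" never appear in the labeled examples, yet the classifier handles them because their pre-trained vectors sit near words it did see.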
We're also beginning to use high-accuracy, multi-language DeepText models to help people find
the right tools for their purpose. For example, someone could write a post that says, "I would
like to sell my old bike for $200, anyone interested?" DeepText would be able to detect that the
post is about selling something, extract the meaningful information such as the object being
sold and its price, and prompt the seller to use existing tools that make these transactions
easier through Facebook.
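To illustrate the shape of the output such a system might produce, the sketch below uses a hand-written regular expression as a stand-in. This is deliberately not how DeepText works (it relies on learned models, not rules); the pattern, field names, and `parse_sale_post` helper are hypothetical.

```python
import re

# Rule-based stand-in for a learned intent and entity model (illustration only).
SELL_PATTERN = re.compile(
    r"\bsell(?:ing)?\s+(?:my|our|a|an|the)\s+(?P<item>.+?)\s+for\s+"
    r"\$(?P<price>\d+(?:\.\d{2})?)",
    re.IGNORECASE,
)


def parse_sale_post(text):
    """Return the detected intent plus the item and price when a sale is found."""
    match = SELL_PATTERN.search(text)
    if match is None:
        return {"intent": "other"}
    return {
        "intent": "sell",
        "item": match.group("item"),
        "price": float(match.group("price")),
    }


result = parse_sale_post("I would like to sell my old bike for $200, anyone interested?")
print(result)  # {'intent': 'sell', 'item': 'old bike', 'price': 200.0}
```

A rule like this breaks as soon as the phrasing changes ("bike up for grabs, 200 bucks"), which is exactly the kind of variation a learned model is meant to absorb.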
https://code.facebook.com/posts/181565595577955
DeepText has the potential to further improve Facebook experiences by understanding posts
better to extract intent, sentiment, and entities (e.g., people, places, events), using mixed
content signals like text and images, and automating the removal of objectionable content like
spam. Many celebrities and public figures use Facebook to start conversations with the public.
These conversations often draw hundreds or even thousands of comments. Finding the most
relevant comments in multiple languages while maintaining comment quality is currently a
challenge. One additional challenge that DeepText may be able to address is surfacing the
most relevant or high-quality comments.
Next steps
We are continuing to advance DeepText technology and its applications in collaboration with the
Facebook AI Research group. Here are some examples.
recurrent neural nets (BRNNs) show promising results, as they aim to capture both contextual
dependencies between words through recurrence and position-invariant semantics through
convolution. We have observed that BRNNs achieve lower error rates than regular
convolutional or recurrent neural nets for classification; in some cases the error rates are as low
as 20 percent.
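The recurrence half of that idea can be sketched as a forward pass that reads the sequence in both directions. This is a minimal sketch under strong assumptions: a single tanh unit per direction, hand-picked weights, numeric inputs instead of word embeddings, and no convolutional component. It is not the architecture evaluated above.

```python
import math


def rnn_pass(inputs, w_in, w_rec):
    """Run a one-unit tanh RNN over inputs; return the hidden state at each step."""
    hidden = 0.0
    states = []
    for x in inputs:
        hidden = math.tanh(w_in * x + w_rec * hidden)
        states.append(hidden)
    return states


def bidirectional_states(inputs, fwd=(0.5, 0.8), bwd=(0.5, 0.8)):
    """Pair each position's left-to-right state with its right-to-left state."""
    forward = rnn_pass(inputs, *fwd)
    # Read the sequence in reverse, then flip back so positions line up.
    backward = rnn_pass(inputs[::-1], *bwd)[::-1]
    return list(zip(forward, backward))


states = bidirectional_states([1.0, 0.0, -1.0])
for position, (left_context, right_context) in enumerate(states):
    print(position, round(left_context, 3), round(right_context, 3))
```

At each position the first value summarizes everything to the left and the second summarizes everything to the right, so a classifier on top sees context from both sides of a word.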
While applying deep learning techniques to text understanding will continue to enhance
Facebook products and experiences, the reverse is also true. The unstructured data on
Facebook presents a unique opportunity for text understanding systems to learn automatically
on language as it is naturally used by people across multiple languages, which will further
advance the state of the art in natural language processing.