6/4/2016

Introducing DeepText: Facebook's text understanding engine | Engineering Blog | Facebook Code

2 June

Introducing DeepText: Facebook's text understanding engine

Ahmad Abdulkader
Aparna Lakshmiratan
Joy Zhang

Text is a prevalent form of communication on Facebook. Understanding the various ways text is used on Facebook can help us improve people's experiences with our products, whether we're surfacing more of the content that people want to see or filtering out undesirable content like spam.
With this goal in mind, we built DeepText, a deep learning-based text understanding engine that can understand with near-human accuracy the textual content of several thousand posts per second, spanning more than 20 languages.
DeepText leverages several deep neural network architectures, including convolutional and recurrent neural nets, and can perform word-level and character-level based learning. We use FBLearner Flow and Torch for model training. Trained models are served with a click of a button through the FBLearner Predictor platform, which provides a scalable and reliable model distribution infrastructure. Facebook engineers can easily build new DeepText models through the self-serve architecture that DeepText provides.
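The difference between word-level and character-level learning comes down to which units the model receives as input. The sketch below is only an illustration of those two views of the same post; the function names `word_units` and `char_units` are hypothetical and are not part of DeepText.

```python
def word_units(text):
    """Split text into lowercase word tokens (whitespace-based, illustrative only)."""
    return text.lower().split()


def char_units(text):
    """Split text into individual lowercase characters, keeping spaces."""
    return list(text.lower())


post = "Happy bday bro"
print(word_units(post))
print(char_units(post))
```

A character-level view still exposes useful pieces of an unusual spelling such as "bday", whereas a word-level view treats it as a single, possibly unseen, token.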

Why deep learning


Text understanding includes multiple tasks, such as general classification to determine what a post is about (basketball, for example) and recognition of entities, like the names of players, stats from a game, and other meaningful information. But to get closer to how humans understand text, we need to teach the computer to understand things like slang and word-sense disambiguation. As an example, if someone says, "I like blackberry," does that mean the fruit or the device?
Text understanding on Facebook requires solving tricky scaling and language challenges where traditional NLP techniques are not effective. Using deep learning, we are able to understand text better across multiple languages and use labeled data much more efficiently than traditional NLP techniques. DeepText has built on and extended ideas in deep learning that were originally developed in papers by Ronan Collobert and Yann LeCun from Facebook AI Research.

Understanding more languages faster


The community on Facebook is truly global, so it's important for DeepText to understand as many languages as possible. Traditional NLP techniques require extensive preprocessing logic built on intricate engineering and language knowledge. There are also variations within each language, as people use slang and different spellings to communicate the same idea. Using deep learning, we can reduce the reliance on language-dependent knowledge, as the system can learn from text with no or little preprocessing. This helps us span multiple languages quickly, with minimal engineering effort.

Deeper understanding
In traditional NLP approaches, words are converted into a format that a computer algorithm can learn. The word "brother" might be assigned an integer ID such as 4598, while the word "bro" becomes another integer, like 986665. This representation requires each word to be seen with exact spellings in the training data to be understood.
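A minimal sketch of this integer-ID representation shows the limitation. The vocabulary, the `UNKNOWN_ID` placeholder, and the `to_ids` helper are assumptions made for illustration; only the two example IDs come from the text above.

```python
# Hypothetical vocabulary built from training data; the two IDs mirror the example above.
vocab = {"brother": 4598, "bro": 986665}
UNKNOWN_ID = 0  # assumed placeholder for any word never seen in training


def to_ids(words):
    """Map each word to its integer ID, falling back to UNKNOWN_ID."""
    return [vocab.get(word, UNKNOWN_ID) for word in words]


print(to_ids(["brother", "bro", "brotha"]))
```

The IDs 4598 and 986665 say nothing about how related the two words are, and a spelling that was never seen, such as "brotha", collapses into the unknown ID.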
With deep learning, we can instead use "word embeddings," a mathematical concept that preserves the semantic relationship among words. So, when calculated properly, we can see that the word embeddings of "brother" and "bro" are close in space. This type of representation allows us to capture the deeper semantic meaning of words.
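One common way to measure "close in space" is cosine similarity between vectors. The sketch below uses hand-picked three-dimensional vectors purely for illustration; real embeddings are learned from data and have many more dimensions, and nothing here reflects DeepText's actual vectors or similarity measure.

```python
import math

# Toy vectors chosen by hand for illustration, not learned from any data.
embeddings = {
    "brother": [0.9, 0.1, 0.3],
    "bro": [0.8, 0.2, 0.3],
    "taxi": [0.1, 0.9, 0.2],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(cosine_similarity(embeddings["brother"], embeddings["bro"]))
print(cosine_similarity(embeddings["brother"], embeddings["taxi"]))
```

With these toy values the first similarity is much higher than the second, which is the property the integer-ID representation cannot express.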
Using word embeddings, we can also understand the same semantics across multiple languages, despite differences in the surface form. As an example, for English and Spanish, "happy birthday" and "feliz cumpleaños" should be very close to each other in the common embedding space. By mapping words and phrases into a common embedding space, DeepText is capable of building models that are language-agnostic.
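To make the idea of a common embedding space concrete, the sketch below places a few English and Spanish words in one hand-made two-dimensional space and compares phrases by distance. The vectors, the averaging rule in `embed_phrase`, and the Euclidean `distance` are all assumptions for illustration, not a description of how DeepText composes or compares phrases.

```python
import math

# Hand-made 2-D vectors standing in for a shared English/Spanish embedding space.
shared_space = {
    "happy": [0.90, 0.10],
    "birthday": [0.80, 0.30],
    "feliz": [0.88, 0.12],
    "cumpleaños": [0.79, 0.33],
    "need": [0.10, 0.90],
    "ride": [0.20, 0.80],
}


def embed_phrase(phrase):
    """Average the word vectors of a phrase (a simple, assumed composition rule)."""
    vectors = [shared_space[word] for word in phrase.lower().split()]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


english = embed_phrase("happy birthday")
spanish = embed_phrase("feliz cumpleaños")
unrelated = embed_phrase("need ride")
print(distance(english, spanish), distance(english, unrelated))
```

A model that operates on such vectors rather than on surface strings never needs to know which language a phrase came from.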

Labeled data scarcity


Written language, despite the variations mentioned above, has a lot of structure that can be extracted from unlabeled text using unsupervised learning and captured in embeddings. Deep learning offers a good framework to leverage these embeddings and refine them further using small labeled data sets. This is a significant advantage over traditional methods, which often require large amounts of human-labeled data that are inefficient to generate and difficult to adapt to new tasks. In many cases, this combination of unsupervised learning and supervised learning significantly improves performance, as it compensates for the scarcity of labeled data sets.
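The sketch below illustrates the general pattern under simplifying assumptions: a table of `pretrained` vectors stands in for the output of an unsupervised step (the values are invented), and a small logistic regression is then trained on only four labeled examples. For brevity the embeddings stay frozen and only the classifier is trained, whereas the text above also describes refining the embeddings themselves. All names are hypothetical.

```python
import math

# Assumed output of an unsupervised step: toy vectors, invented for illustration.
pretrained = {
    "game": [1.0, 0.0], "score": [0.9, 0.1], "team": [0.8, 0.2], "match": [0.95, 0.05],
    "recipe": [0.0, 1.0], "bake": [0.1, 0.9], "oven": [0.2, 0.8], "dough": [0.05, 0.95],
}


def featurize(text):
    """Average the pretrained vectors of the known words in the text."""
    vectors = [pretrained[w] for w in text.lower().split() if w in pretrained]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=500, lr=0.5):
    """Fit logistic regression weights on a small labeled set with gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = featurize(text)
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(model, text):
    """Return the estimated probability that the text is about sports."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, featurize(text))) + bias)


# Only four labeled examples: 1 = sports, 0 = cooking.
labeled = [("game score", 1), ("team game", 1), ("recipe bake", 0), ("oven recipe", 0)]
model = train(labeled)
print(predict(model, "match"), predict(model, "dough"))
```

The words "match" and "dough" never appear in the labeled set, yet the classifier handles them because their pretrained vectors sit near words it did see.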

Exploring DeepText on Facebook


DeepText is already being tested on some Facebook experiences. In the case of Messenger, for example, DeepText is used by the AML Conversation Understanding team to get a better understanding of when someone might want to go somewhere. It's used for intent detection, which helps realize that a person is not looking for a taxi when he or she says something like, "I just came out of the taxi," as opposed to "I need a ride."
[Video: DeepText on Messenger, posted by Facebook Engineering]

We're also beginning to use high-accuracy, multi-language DeepText models to help people find the right tools for their purpose. For example, someone could write a post that says, "I would like to sell my old bike for $200, anyone interested?" DeepText would be able to detect that the post is about selling something, extract the meaningful information such as the object being sold and its price, and prompt the seller to use existing tools that make these transactions easier through Facebook.
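To show what the extracted information might look like as structured data, the sketch below uses a simple regular expression as a stand-in. This is explicitly not how DeepText works (it is a learned model, not a pattern matcher); the `SaleListing` type, the `extract_listing` function, and the pattern are all hypothetical and only cover posts phrased like the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class SaleListing:
    """Hypothetical structured result for a post that offers something for sale."""
    item: Optional[str]
    price_usd: Optional[int]


def extract_listing(post):
    """Pull an item and a dollar price out of a 'sell my X for $N' style post."""
    match = re.search(r"sell (?:my |an? |the )?(.+?) for \$(\d+)", post, re.IGNORECASE)
    if match is None:
        return SaleListing(item=None, price_usd=None)
    return SaleListing(item=match.group(1), price_usd=int(match.group(2)))


post = "I would like to sell my old bike for $200, anyone interested?"
print(extract_listing(post))
```

A hand-written rule like this breaks as soon as the wording, spelling, or language changes, which is the kind of brittleness a learned text understanding model is meant to avoid.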
DeepText has the potential to further improve Facebook experiences by understanding posts better to extract intent, sentiment, and entities (e.g., people, places, events), using mixed content signals like text and images, and automating the removal of objectionable content like spam. Many celebrities and public figures use Facebook to start conversations with the public. These conversations often draw hundreds or even thousands of comments. Finding the most relevant comments in multiple languages while maintaining comment quality is currently a challenge. One additional challenge that DeepText may be able to address is surfacing the most relevant or high-quality comments.

Next steps
We are continuing to advance DeepText technology and its applications in collaboration with the Facebook AI Research group. Here are some examples.

Better understanding people's interests


Part of personalizing people's experiences on Facebook is recommending content that is relevant to their interests. In order to do this, we must be able to map any given text to a particular topic, which requires massive amounts of labeled data.
While such data sets are hard to produce manually, we are testing the ability to generate large data sets with semi-supervised labels using public Facebook pages. It's reasonable to assume that the posts on these pages will represent a dedicated topic; for example, posts on the Steelers page will contain text about the Steelers football team. Using this content, we train a general interest classifier we call PageSpace, which uses DeepText as its underlying technology. In turn, this could further improve the text understanding system across other Facebook experiences.
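The labeling idea can be sketched in a few lines: every post inherits the topic of the page it was published on, so no human annotation is needed. The sample data and the `build_weakly_labeled_set` helper are invented for illustration and say nothing about how PageSpace actually collects or filters its training data.

```python
# Hypothetical sample of public page posts; page names and texts are invented.
page_posts = {
    "Steelers": ["Big win on Sunday night", "Training camp opens next week"],
    "Baking Daily": ["Three tips for softer bread"],
}


def build_weakly_labeled_set(posts_by_page):
    """Label every post with the topic of the page it was published on."""
    return [(text, page) for page, texts in posts_by_page.items() for text in texts]


dataset = build_weakly_labeled_set(page_posts)
for text, topic in dataset:
    print(topic, "<-", text)
```

Labels produced this way are noisy, since not every post on a page is about the page's topic, which is why the text describes them as semi-supervised rather than human-labeled.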

Joint understanding of textual and visual content


Often people post images or videos and also describe them using some related text. In many of those cases, understanding intent requires understanding both textual and visual content together. As an example, a friend may post a photo of his or her new baby with the text "Day 25." The combination of the image and text makes it clear that the intent here is to share family news. We are working with Facebook's visual content understanding teams to build new deep learning architectures that learn intent jointly from textual and visual inputs.

New deep neural network architectures


We continue to develop and investigate new deep neural network architectures. Bidirectional recurrent neural nets (BRNNs) show promising results, as they aim to capture both contextual dependencies between words through recurrence and position-invariant semantics through convolution. We have observed that BRNNs achieve lower error rates than regular convolutional or recurrent neural nets for classification; in some cases the error rates are as low as 20 percent.
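As a rough illustration of the bidirectional recurrence only, the sketch below runs a plain tanh RNN over a sequence left to right and right to left, then concatenates the two hidden states at each position. It is a generic, untrained forward pass with random weights; it omits the convolutional component mentioned above and is not the architecture used in DeepText. All function names are hypothetical.

```python
import math
import random


def rnn_pass(inputs, w_in, w_rec):
    """Run a simple tanh RNN over a sequence and return the hidden state at each step."""
    hidden_size = len(w_rec)
    state = [0.0] * hidden_size
    states = []
    for x in inputs:
        state = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * state[k] for k in range(hidden_size))
            )
            for i in range(hidden_size)
        ]
        states.append(state)
    return states


def bidirectional_rnn(inputs, forward_weights, backward_weights):
    """Concatenate left-to-right and right-to-left hidden states for each position."""
    forward = rnn_pass(inputs, *forward_weights)
    backward = rnn_pass(inputs[::-1], *backward_weights)[::-1]
    return [f + b for f, b in zip(forward, backward)]


def random_weights(rng, input_size, hidden_size):
    """Create untrained (w_in, w_rec) matrices; a real model would learn these."""
    w_in = [[rng.uniform(-1, 1) for _ in range(input_size)] for _ in range(hidden_size)]
    w_rec = [[rng.uniform(-1, 1) for _ in range(hidden_size)] for _ in range(hidden_size)]
    return w_in, w_rec


rng = random.Random(0)
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy word vectors
outputs = bidirectional_rnn(sequence, random_weights(rng, 2, 3), random_weights(rng, 2, 3))
print(len(outputs), len(outputs[0]))
```

Because the backward pass starts from the end of the sequence, the representation of the first word already reflects the words that follow it, which a one-directional recurrent net cannot provide.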
While applying deep learning techniques to text understanding will continue to enhance Facebook products and experiences, the reverse is also true. The unstructured data on Facebook presents a unique opportunity for text understanding systems to learn automatically on language as it is naturally used by people across multiple languages, which will further advance the state of the art in natural language processing.
[Video: DeepText, posted by Facebook Engineering]

https://code.facebook.com/posts/181565595577955
