We use geographical databases to find a geo-coded representation of toponyms. The results contain many ambiguities: several toponyms can have the same name, and one toponym can have several names. Toponym subtyping and clustering methods are used to disambiguate them.
Original title: Automatic Itinerary Reconstruction Form Text - Poster_Moncla
Automatic Itinerary Reconstruction from Texts
Ludovic MONCLA 1,2 (ludovic.moncla@univ-pau.fr)
1 LIUPPA, Avenue de l'Université, Pau, France
2 Universidad de Zaragoza, C/ Maria de Luna, 1, Zaragoza, Spain
3 Université Paris-Est, IGN, Laboratoire COGIT, 73 av. de Paris, 94160 Saint-Mandé, France
Fig. 1: Block diagram of our processing chain
We define an expanded spatial named entity (ESNE) as a combination of three elements (I, G, T): spatial relations, geographical terms and toponyms.
We also annotate motion expressions (classified verbs).
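As an illustration, an ESNE can be held in a small record type. The sketch below is only one possible representation, not the data model used in the processing chain: the class name `ESNE`, its field names and the `label` helper are hypothetical, and the values are taken from the "hamlet of Fontanettes" example on this poster.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ESNE:
    """Hypothetical container for an expanded spatial named entity."""
    toponym: str                              # T: the place name
    geographical_term: Optional[str] = None   # G: e.g. "hamlet"
    spatial_relation: Optional[str] = None    # I: e.g. "south of"

    def label(self) -> str:
        """Rebuild a readable phrase from the parts that are present."""
        term = f"{self.geographical_term} of" if self.geographical_term else None
        parts = [self.spatial_relation, term, self.toponym]
        return " ".join(p for p in parts if p)


esne = ESNE(toponym="Fontanettes",
            geographical_term="hamlet",
            spatial_relation="south of")
print(esne.label())
```

Keeping the three elements as separate optional fields makes it easy to handle a bare toponym (no relation, no term) with the same type.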
We use geographical databases to find a geo-coded representation of the extracted toponyms.
The results contain many ambiguities:
- Several toponyms can have the same name;
- One toponym can have several names;
- Resources are not exhaustive.
Toponym subtyping and clustering methods are used to disambiguate toponyms:
- "hamlet of Fontanettes" is not found in the databases;
- the sub-toponym "Fontanettes" returns several results;
- one of them carries the metadata type="hamlet".
Example of result:
{{Walk,.motion} to {{the refuge,.commonNoun} {south of,.indirection} {{hamlet,.subType} of {Fontanettes,.subToponym},.candidateToponym},.ESNE},.VT}
Table 1: Toponyms found in databases (table contents not recoverable)
We use a clustering algorithm to disambiguate toponyms.
Fig 4: Result after clustering
Corpus
- 30 hiking descriptions, manually annotated
- French, Spanish, Italian
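The subtyping and clustering steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm used in this work, which the poster does not specify: single-linkage clustering with a fixed distance threshold is swapped in as a stand-in, the names `Candidate`, `filter_by_subtype`, `cluster` and `disambiguate` are hypothetical, the 20 km threshold is arbitrary, and all coordinates and the places "Village X" and "Lake Y" are invented for the example.

```python
import math
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    name: str    # toponym as written in the text
    kind: str    # feature type returned by the gazetteer, e.g. "hamlet"
    lat: float
    lon: float


def haversine_km(a: Candidate, b: Candidate) -> float:
    """Great-circle distance between two candidates, in kilometres."""
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dlon = math.radians(b.lon - a.lon)
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def filter_by_subtype(cands: List[Candidate], subtype: str) -> List[Candidate]:
    """Keep candidates whose type matches the subtype; keep all if none match."""
    matching = [c for c in cands if c.kind == subtype]
    return matching or list(cands)


def cluster(points: List[Candidate], max_km: float) -> List[List[Candidate]]:
    """Single-linkage clustering: group points chained by links <= max_km."""
    unvisited = list(points)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        group, frontier = [seed], [seed]
        while frontier:
            p = frontier.pop()
            near = [q for q in unvisited if haversine_km(p, q) <= max_km]
            for q in near:
                unvisited.remove(q)
            group.extend(near)
            frontier.extend(near)
        clusters.append(group)
    return clusters


def disambiguate(points: List[Candidate], max_km: float = 20.0) -> Dict[str, Candidate]:
    """Pick the cluster covering the most distinct toponyms, one candidate each."""
    best = max(cluster(points, max_km), key=lambda g: len({c.name for c in g}))
    chosen: Dict[str, Candidate] = {}
    for c in best:
        chosen.setdefault(c.name, c)
    return chosen


# Invented gazetteer results for the sub-toponym "Fontanettes".
fontanettes = [
    Candidate("Fontanettes", "hamlet", 45.37, 6.72),
    Candidate("Fontanettes", "hamlet", 44.10, 1.50),
    Candidate("Fontanettes", "stream", 45.90, 5.00),
]
# Invented unambiguous places mentioned in the same description.
context = [
    Candidate("Village X", "village", 45.38, 6.72),
    Candidate("Lake Y", "lake", 45.40, 6.75),
]

hamlets = filter_by_subtype(fontanettes, "hamlet")
chosen = disambiguate(hamlets + context)
print(chosen["Fontanettes"])
```

The subtype filter removes the "stream" result first; clustering then prefers the hamlet that lies close to the other places of the same description. Toponyms with no candidate in the selected cluster stay unresolved in this sketch.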
Fig 2: Example of transducer
Extraction of toponyms
- 99.26% of toponyms are correctly extracted
- 1.63% of detections are wrong (person names, organizations, etc.)
- 54.47% of toponyms are associated with spatial relations
II. Toponym resolution
Resolution of toponyms
- Databases queried: BDNyme (IGN), Geonames, OpenStreetMap
- 30-60% of toponyms: exactly 1 result
- 10% of toponyms: more than 20 results
Fig 3: Distribution of the number of results (French)
Itinerary reconstruction
- Propose a method of route calculation:
  o Use information extracted from the textual description
    - perception of places not reached
    - negation (don't go to, etc.)
    - directional information (south, north, etc.)
  o Use spatial analysis to link places
- Combine language and spatial analysis to order places and reconstruct the path
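One way directional information from the text could be combined with spatial analysis is to test a stated relation such as "south of" against candidate coordinates. The sketch below is an assumption-based illustration, not the route calculation method proposed here: the function names, the 45-degree tolerance and the coordinates are all hypothetical.

```python
import math

# Compass bearing (degrees clockwise from north) for each cardinal direction.
SECTORS = {"north": 0.0, "east": 90.0, "south": 180.0, "west": 270.0}


def bearing_deg(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Initial great-circle bearing from point 1 to point 2, in [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(p2)
    x = (math.cos(p1) * math.sin(p2)
         - math.sin(p1) * math.cos(p2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def satisfies_direction(ref, target, direction: str, tolerance: float = 45.0) -> bool:
    """True if `target` lies in `direction` of `ref`, within `tolerance` degrees."""
    b = bearing_deg(ref[0], ref[1], target[0], target[1])
    # Smallest angular difference between the bearing and the sector centre.
    diff = abs((b - SECTORS[direction] + 180.0) % 360.0 - 180.0)
    return diff <= tolerance


# Invented coordinates: a hamlet and a refuge candidate due south of it.
hamlet = (45.37, 6.72)
refuge = (45.35, 6.72)
print(satisfies_direction(hamlet, refuge, "south"))
print(satisfies_direction(hamlet, refuge, "north"))
```

A check of this kind could be used to discard candidate locations that contradict the description before the places are ordered into a path.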
Supervisors: M. Gaio 1, J. Nogueras-Iso 2. In partnership with S. Mustière 3.
Experiments