Georeferencing Unstructured Text

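[Abstract] As introduced in the preceding article, geographic information extraction is central to building a geographic knowledge graph, and georeferencing unstructured text is in turn central to geographic information extraction. This article surveys the state of research on georeferencing unstructured text. In China, geographic information science and computer science are separate disciplines, and georeferencing is an interdisciplinary and relatively difficult problem, so in-depth or systematic domestic research is scarce. This article therefore focuses on international work.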

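1. Georeferences in Text

  • Georeferences

    • A reference to a location in text (formally, a text fragment) is called a georeference; it is also known as a location reference, location identifier, or geotag
  • **Examples of georeferences:** Georeferences take many forms. For example, each of the following locates Peking University:

    • No. 5 Yiheyuan Road, Haidian District, Beijing – the postal address of Peking University

    • Peking University – the place name of Peking University

    • 100871 – the Chinese postal code of Peking University

    • X8P4+Q8 – the Google Open Location Code of Peking University

    • 39.986913,116.3036799 – the latitude and longitude of Peking University

    • In addition, many codes compiled in specialist domains carry spatial meaning, such as Peking University's real-estate unit registration code or its urban-management grid code, and all of these can be treated as georeferences.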

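  • People generally expect georeferences to have the following properties:

    • Unambiguous ---- within a given reference frame, a georeference denotes exactly one location, although place names often fail to achieve this
    • Shared ---- georeferences should be shared among the people who use them, and those people should understand their geographic meaning
    • Persistent over time ---- georeferences should stay as stable as possible over time, although in practice place names and administrative divisions change frequently
    • In addition, georeferences usually carry an implicit granularity; for example, the postal address of a remote village clearly implies a granularity much coarser than 1 metre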

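2. Georeferencing Text

  • Georeferencing
    • The task of converting a georeference in text into a location in physical space (e.g., latitude/longitude coordinates) is called georeferencing
    • Georeferencing converts a georeference, which is a text fragment, into a location in a specific physical space, i.e., a coordinate value or spatial entity that a computer can process, and thereby gives it geographic semantics
    • Identifying the georeferences in a text helps establish its geographic context, which in turn helps judge how strongly that context relates to the text's main topic
  • In the geographic information extraction task:
    • The starting point of georeferencing is a document or passage of natural-language text
      • It may contain georeferences in many forms: latitude/longitude coordinates, place names, geographic regions (e.g., countries), and so on
    • The end point of georeferencing is the corresponding coordinates in physical space
    • Geographic information extraction aims to perform this task automatically, which is also a core challenge of geographic information retrieval (GIR)
    • Although there are many types of georeference, most GIR work currently focuses on georeferencing place names and addresses
  • The georeferencing process
    • **Geoparsing:** the process of identifying the georeferences in a text;
    • **Geocoding:** the process of resolving a georeference, i.e., interpreting it as a spatial object such as a point, line, or polygon; it comprises two sub-steps
      • Obtain the unique identifier of the geographic entity that the georeference denotes
      • Use that identifier to obtain the entity's spatial geometry
    • Georeferencing at different document scopes
      • The work above addresses georeferencing at the document level
      • Building on it, some researchers study georeferencing at other scopes within a document (e.g., paragraphs)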
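
The two stages above (geoparsing, then geocoding in two sub-steps) can be sketched as a small pipeline. This is only a minimal illustration under stated assumptions: the dictionaries `NAME_TO_ID` and `ID_TO_POINT`, the identifiers in them, and all function names are hypothetical, and the substring-based geoparser stands in for the real techniques discussed in Section 3.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy gazetteer: surface name -> entity identifier.
NAME_TO_ID: Dict[str, str] = {"Peking University": "pku", "Bath": "bath-uk"}
# Hypothetical geometry store: entity identifier -> (latitude, longitude).
ID_TO_POINT: Dict[str, Tuple[float, float]] = {
    "pku": (39.986913, 116.3036799),
    "bath-uk": (51.3794, -2.3656),
}


def geoparse(text: str) -> List[str]:
    """Geoparsing: return the known names that occur in the text."""
    return [name for name in NAME_TO_ID if name in text]


def resolve_identifier(georeference: str) -> Optional[str]:
    """Geocoding, sub-step 1: georeference -> unique entity identifier."""
    return NAME_TO_ID.get(georeference)


def fetch_geometry(entity_id: str) -> Optional[Tuple[float, float]]:
    """Geocoding, sub-step 2: entity identifier -> spatial geometry."""
    return ID_TO_POINT.get(entity_id)


def georeference_text(text: str) -> Dict[str, Tuple[float, float]]:
    """Run geoparsing and then geocoding over one piece of text."""
    located = {}
    for reference in geoparse(text):
        entity_id = resolve_identifier(reference)
        if entity_id is not None:
            point = fetch_geometry(entity_id)
            if point is not None:
                located[reference] = point
    return located
```

Keeping the identifier lookup separate from the geometry lookup mirrors the two sub-steps listed above and lets several names share one entity.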

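3. Geoparsing Techniques

3.1 The Geoparsing Task

  • The basic task of geoparsing is to determine candidate georeferences.
  • (Text contains many types of georeference; most current research focuses on place names, in which case the task is also called toponym parsing or toponym recognition.)
  • The task is comparable to named entity recognition (NER) in information extraction, which assigns each word or group of words in a text to one of a set of predefined categories, such as names of locations, persons, or organizations, times, dates, and numeric expressions such as monetary amounts.
  • The output of geoparsing is a list of candidate georeferences, which is the input to the subsequent geocoding task.

For example, the text in the table above shows typical output from a named entity recognition system (org denotes an organization, loc a location). Most NER systems are processing pipelines made up of tokenization, part-of-speech tagging, entity matching, and similar steps.

  • Much earlier NER work extracted relatively coarse-grained entities such as countries, states, provinces, and cities

  • Recognition of finer-grained place names is now increasingly studied (Uryupina, 2003; Axelrod, 2003)

  • Toponym recognition must deal with semantic ambiguity

    • For example, does “仙人指” (literally “immortal's finger”) refer to the finger of an immortal or to a place?
  • In such cases the same name may belong to several different categories; this is usually called referent class ambiguity or geo/non-geo ambiguity (Amitay et al., 2004).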

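3.2 Basic Approach to Geoparsing

  • Most geoparsing methods combine known gazetteers (Hill, 2000) and lists of organizations and persons with machine learning techniques, or with rules built from surrounding contextual cues
  • Whatever the method, building a comprehensive and accurate gazetteer is the foundation
    • Importance and foundational role
      • Building a gazetteer is a fundamental and challenging task (Sehgal et al., 2006; Martins, 2011; Recchia and Louwerse, 2013)
    • Trend: merging multiple gazetteers
      • High-quality gazetteers are increasingly built by merging gazetteer data from multiple sources, which is a very important task for geoparsing (Manguinhas et al., 2008; Smart et al., 2010)
    • Trend: merging vernacular and official names
      • Place-name usage in natural language is not limited to official or administrative purposes, so gazetteers should also include local (vernacular) place names and their spatial footprints (Keßler et al., 2009; Jones et al., 2008)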

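3.3 Main Types of Geoparsing Method

Three classes of geoparsing method (Leidner and Lieberman, 2011):

  • Methods based on list lookup
  • Methods based on knowledge or rules
  • Methods based on machine learning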

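3.4 Geoparsing by List Lookup

List lookup is the most basic way to identify georeferences.

  • How it works:
    • Look up known entities in pre-compiled lists of place names, addresses, postal codes, telephone numbers, and so on
  • Implementation:
    • String matching
  • Advantages:
    • Simple, fast, and language-independent
    • The quality of the known gazetteer list matters more than its size
    • For example, Mikheev et al. (1999) collected 5,000 locations from the CIA World Factbook and, evaluating on MUC-7 data, found that location list lookup achieved 90-94% precision and 75-85% recall.
  • Disadvantages:
    • Cannot identify new entities that are missing from the gazetteer
    • How to match against list entries must be decided manually (e.g., whether to allow partial matches or require exact matches)
    • Cannot resolve polysemy (e.g., Chicago may denote the US city or the pop group, and a list cannot guarantee that the word is used in its geographic sense);
    • Georeferences often appear in different forms, so the list must include every variant of a name to achieve good recognition accuracy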
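
As a minimal sketch of list lookup, the function below scans the tokens of a text and greedily keeps the longest phrase found in a gazetteer. The function name, the `max_len` parameter, and the tiny gazetteer in the usage note are assumptions made for illustration; exact, case-sensitive matching is just one of the matching policies mentioned above.

```python
import re
from typing import List, Set, Tuple


def lookup_georeferences(text: str, gazetteer: Set[str],
                         max_len: int = 3) -> List[Tuple[int, str]]:
    """Return (token_index, name) pairs for gazetteer entries found in text.

    At each position the longest phrase (up to max_len tokens) that is an
    exact, case-sensitive gazetteer entry wins, so "New York" beats "York".
    """
    tokens = re.findall(r"\w+", text)
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in gazetteer:
                matches.append((i, candidate))
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no entry starts at this token
    return matches
```

With the gazetteer `{"New York", "York", "Chicago"}`, the sentence "Chicago played in New York" yields both "Chicago" and "New York". The lookup cannot tell that "Chicago" is a band here, which is exactly the polysemy drawback listed above.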

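3.5 Geoparsing by Knowledge or Rules

  • How it works:

    • Use rules or knowledge to exploit the surrounding context and so improve the recognition of georeferences.
    • For example, in the sentence “Keep up on your reading with audio books”, a gazetteer-only method might wrongly identify “on”, “reading”, and “books” as places in Vietnam, the UK, and Louisiana (USA) respectively. None of these words has a geographic meaning here, and using the surrounding context markedly improves recognition accuracy.
  • Implementation:

    • A grammar defines and expresses contextual relations; words or phrases that satisfy the grammar are recognized as georeferences
    • Expressing the grammar: rules are encoded as regular expressions and applied by text matching
      • Common regular-expression pattern-matching tools include lex, flex, and fgrep
    • Evidence (the content in the context that supports a grammar rule is called evidence):
      • Internal evidence (rules):

        • Text often has some internal structure or phrasing that indicates a place name
        • Such features include capitalization, prefixes or suffixes (e.g., company designators), place-name lists, and so on
        • For example:
          • With the internal-evidence rule “CapWord + { Street, Boulevard, Avenue, Crescent, Road }”, georeferences such as “Portobello Street” and “Sunset Boulevard” can be captured
      • External evidence (rules):

        • The context may contain evidence indicating what type of entity a word or phrase is
          • For example, in “President Washington chopped the tree”, “President” is clear external evidence that “Washington” is a person's name rather than a place
        • Contextual rules can be built to help distinguish named-entity categories, for example:
          • Positive example: the rule “North of CapWord” helps identify “CapWord” as a place name
          • Negative example: “Firstname + Location”, where Firstname matches a dictionary of common first names and Location matches the gazetteer; a “Location” that satisfies this rule can be inferred to be a false georeference
  • Advantages:

    • Because it is supported by evidence, precision is higher than with list lookup
    • It can find some new geographic terms
  • Disadvantages:

    • Hard to handle irregular georeferences that cannot be expressed explicitly as regular expressions
    • Rules can never be exhaustive, and too many rules hurt efficiency
    • Capitalization is a common recognition rule, but it does not apply to Chinese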
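
The internal and external evidence rules above can be written directly as regular expressions. The sketch below is an illustration only: the pattern names, the `find_georeferences` function, and the simplification that a name rejected once is rejected for the whole text are assumptions, and “CapWord” is approximated as one capital letter followed by lowercase letters.

```python
import re
from typing import Iterable, List

STREET_SUFFIXES = ("Street", "Boulevard", "Avenue", "Crescent", "Road")

# Internal evidence: CapWord + {Street, Boulevard, Avenue, Crescent, Road}.
INTERNAL_RULE = re.compile(
    r"\b([A-Z][a-z]+ (?:%s))\b" % "|".join(STREET_SUFFIXES))
# External evidence (positive): "north of CapWord" suggests a place.
POSITIVE_CONTEXT = re.compile(r"\b[Nn]orth of ([A-Z][a-z]+)\b")
# External evidence (negative): "President CapWord" suggests a person.
NEGATIVE_CONTEXT = re.compile(r"\bPresident ([A-Z][a-z]+)\b")


def find_georeferences(text: str, gazetteer: Iterable[str] = ()) -> List[str]:
    """Combine rule matches and gazetteer hits, then drop rejected names."""
    rejected = set(NEGATIVE_CONTEXT.findall(text))
    found = INTERNAL_RULE.findall(text) + POSITIVE_CONTEXT.findall(text)
    found += [name for name in gazetteer
              if re.search(r"\b%s\b" % re.escape(name), text)]
    result = []
    for name in found:
        if name not in rejected and name not in result:
            result.append(name)
    return result
```

For “President Washington chopped the tree” with a gazetteer containing “Washington”, the negative rule removes the gazetteer hit, whereas “She flew to Washington” keeps it.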

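3.6 Geoparsing by Machine Learning

  • How it works:

    • In older NER systems, rules were usually created by hand
    • Most modern methods use machine learning to capture contextual features from a previously labelled training set, thereby inducing rules automatically and applying them to geoparsing
  • Implementation:

    • Hidden Markov models
    • Decision trees
    • Maximum entropy
    • Neural networks
  • Example:

    • Curran and Clark (2003) used a maximum entropy tagger with language-independent features, including part of speech, lists of preceding and following named entities, and whether a word is a first name or surname. On data from the CoNLL-2003 NE task, they reported F1 scores of 87.7% (English), 71% (German), and 83.2% (Dutch)
  • Two core issues:

    • Training data
    • The generalization ability of the classifier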
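
Whatever classifier is chosen, it consumes contextual features of the kind listed above. The sketch below shows only that feature-extraction step for a single token; it is not Curran and Clark's feature set, and the function name, feature names, and the `first_names` and `gazetteer` arguments are hypothetical.

```python
from typing import Dict, List, Set, Union

Feature = Union[str, bool]


def token_features(tokens: List[str], i: int,
                   first_names: Set[str],
                   gazetteer: Set[str]) -> Dict[str, Feature]:
    """Describe token i by itself and its immediate neighbours."""
    word = tokens[i]
    return {
        "word_lower": word.lower(),
        "is_capitalised": word[:1].isupper(),
        "in_gazetteer": word in gazetteer,
        "is_first_name": word in first_names,
        # Sentence-boundary markers stand in for missing neighbours.
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_word": tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>",
    }
```

A labelled training set of such feature dictionaries is what a maximum entropy model, decision tree, or neural network would learn from; the quality and coverage of that training set is the first of the two core issues noted above.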

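3.7 Geoparsing Semi-structured Data

Many researchers have also discussed georeferencing semi-structured web content (McCurley, 2001; Borges et al., 2003; Morimoto, 2003). For example:

  • Borges et al. (2003) used wrapper induction to extract addresses from web pages
  • A wrapper is a system that relies heavily on document format and markup language to induce information extraction (IE) patterns; each format needs its own wrapper
  • How it works: analyse a set of annotated sample web pages to learn how page text structure, HTML markup, and so on relate to georeferences; derive corresponding regular expressions; and build a general wrapper from them to extract georeferences from new pages
  • In essence, an extraction rule is learned that depends only on page structure and markup, not on the place names themselves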

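4. Geocoding Techniques

Once the list of candidate georeferences has been determined, the next task is to find, in the context of a geographic knowledge base (e.g., a gazetteer or ontology), the identifier corresponding to each one. The identifier is generally unique, carries geographic semantics, and is associated with some metric space (Buscaldi, 2011; Leidner, 2008; Leidner and Lieberman, 2011; Andogah et al., 2012). For example, the place name “Bath” in south-west England can be assigned the geographic coordinates (51.3794°N, 2.3656°W).

4.1 Problems in Geocoding

A common problem when geocoding place names is referent ambiguity (or geographic ambiguity):

(1) One name, many places

An identified place name may refer to just one of several locations in the gazetteer that share the same name. For example:

Which of the following does the name “Chapeltown” refer to?

  • South Yorkshire (UK)
  • Lancashire (UK)
  • Kent County (USA)
  • A location in Panola County (USA)

(2) One place, many names

Different (or alternative) place names may refer to the same location. For example:

  • “London” and “Londres” both denote London
  • Informal or vernacular place names: for example, “New York” is also called the “Big Apple”, and the CCTV headquarters building is also nicknamed “大裤衩” (“Big Pants”)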

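4.2 Main Geocoding Techniques

(1) Purpose of geocoding

  • The goal of geocoding is to select, for each ambiguous place name, the best-matching location from a set of candidate locations.

(2) How geocoding works

  • Heuristics or built-in rules are typically used to map georeferences to an external resource (e.g., a gazetteer) and so obtain spatial coordinates
  • For example:
    • Use other knowledge about the locations (e.g., population, land area, preferences for place types) to resolve ambiguity
    • Constrain the mapping with known linguistic or geographic properties:
      • One referent per document: assume that a place name refers to the same location throughout a text (Leidner, 2008)
      • Tobler's law (Tobler, 1970): near things are more related than distant things. Heuristic methods are usually simple and efficient, and perform well in many domains (Leidner, 2008).

(3) Three typical classes of toponym disambiguation method

Toponym disambiguation methods fall into three classes:

  • Rule- and knowledge-based methods: use external knowledge resources to find disambiguation clues, e.g., population statistics or the most frequent sense in query statistics
  • Machine-learning methods: data-driven, using training sets
  • Spatial-distance methods: compute spatial distance or semantic similarity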

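4.3 Rule- and Knowledge-based Methods

(1) How it works:

On top of lexical or syntactic matching, look for disambiguation clues in other knowledge or resources, use those clues to find the closest-matching place name or address in the gazetteer, and return the corresponding spatial coordinates.

(2) Worked example:

Consider the Wikipedia examples for London (UK) and London (Canada) in the table below. Suppose “London” has been identified as a place during geoparsing; the next step is to locate it in physical space using a gazetteer or ontology. Suppose we have access to the GeoNames global gazetteer and look up London as a populated place: the result is 67 possible locations, including London in the UK, London in Canada, and several places named London in the United States.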

![](https://raw.githubusercontent.com/shilang1220/imageBed/master/img/2020-05-28 13-24-42的屏幕截图.png)

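(3) Solutions:

  • Approach 1: assign a default location to the ambiguous name using external world knowledge

    • Principle:
      • Use population, GDP, area, and so on as external knowledge for choosing among candidate places (Andogah et al., 2012)
    • Input:
      • A gazetteer containing additional knowledge; e.g., GeoNames includes population data for cities
    • Measure:
      • Take the most populous city as the default disambiguation rule
    • Output:
      • According to GeoNames, London in the UK, which has the largest population, is the final result
    • Caveats:
      • The default-sense heuristic is normally used only when all other disambiguation methods have failed
      • For example, population-based knowledge does not work for the second example in the table, because London (UK) is returned regardless of context
  • Approach 2: use contextual clues to resolve geographic ambiguity

    • Principle:
      • Treat clues in the context as evidence and combine them through some measure to identify the best-matching place
    • Input:
      • Input 1: other place names mentioned in the same text segment (e.g., paragraph) or in the whole document
        • For example, “England” appears in the first example and “Ontario” in the second, and each of these regions contains the corresponding London
      • Input 2: hierarchy information from a gazetteer (e.g., the Thesaurus of Geographic Names, TGN)
        • For example: World > Europe > United Kingdom > England > Greater London > London, and World > North and Central America > Canada > Ontario > London
    • Measure:
      • Compute the textual similarity between the words in the gazetteer hierarchy and the words around the ambiguous place name
      • Or compute some distance measure (e.g., ontological distance) between the words in the gazetteer hierarchy and the words around the ambiguous place name (Amitay et al., 2004)
    • Output:
      • The gazetteer entry with the highest textual similarity, or with the smallest distance
    • Example:
      • Buscaldi, 2011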
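
The two approaches above can be combined in a few lines: score each candidate by how many of its hierarchy ancestors appear in the context, and fall back to the most populous candidate when the context gives no evidence. This is a minimal sketch rather than any cited system; the candidate records, identifiers, and function name are hypothetical, and the population figures are rough illustrative values, not GeoNames data.

```python
from typing import Dict, List

# Hypothetical candidate records. The hierarchy strings follow the TGN-style
# example above; the populations are rough illustrative values only.
LONDON_CANDIDATES: List[Dict] = [
    {"id": "london-uk", "population": 9_000_000,
     "hierarchy": ["Europe", "United Kingdom", "England", "Greater London"]},
    {"id": "london-ca", "population": 400_000,
     "hierarchy": ["North and Central America", "Canada", "Ontario"]},
]


def resolve_toponym(context: str, candidates: List[Dict]) -> str:
    """Pick the candidate whose ancestors overlap most with the context.

    Ties, including the case of no overlap at all, are broken by population,
    which acts as the default-sense heuristic.
    """
    text = context.lower()

    def score(candidate: Dict):
        overlap = sum(1 for ancestor in candidate["hierarchy"]
                      if ancestor.lower() in text)
        return (overlap, candidate["population"])

    return max(candidates, key=score)["id"]
```

Ordering the score as (overlap, population) reflects the caveat above: the population default is consulted only when the contextual evidence does not separate the candidates.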

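4.4 Machine-learning Methods

  • Motivation:

    • Knowledge- or rule-based methods have limits: the place-name words that occur nearby may not be enough to disambiguate successfully;
    • An alternative is to also consider non-geographic words in the local context of the place name (Overell and Rüger, 2008).
  • How it works:

    • Assuming that particular non-geographic words tend to occur in the context of particular locations, use all words around the place name for disambiguation, with machine learning identifying the words most likely to co-occur with each location; this widens the range of evidence
    • For example, in text about Portland, the word “lobster” indirectly suggests Portland, Maine, rather than Portland in Oregon or Michigan.
  • Input:

    • Input 1: all words (usually nouns) within some window around the place-name reference
    • Input 2: a place-name/word co-occurrence language model (e.g., a co-occurrence matrix) obtained beforehand by supervised or unsupervised methods
  • Measure:

    • The similarity between the context words and the language model of each gazetteer entry
  • Output:

    • The gazetteer entry with the highest overall word association
  • Example:

    • Speriosu and Baldridge (2013) trained a supervised learner on geotagged Wikipedia articles and used all words in the surrounding document for disambiguation.
    • Their approach matches the georeference to be resolved, together with its context words, against a language model built from training samples for each candidate gazetteer entry
    • The best-matching gazetteer entry is the preferred candidate
  • In essence:

    • The method captures contextual clues as features and treats the problem as a supervised learning, or classification, task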
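4.5 Map-based (Spatial) Methods

  • Motivation:

    • The methods above still rest on lexical or syntactic similarity and do not use spatial semantics
    • They need a language model trained in advance, may be biased by the training sample, and can be restricted to one type of corpus
  • Principle:

    • Assuming that the locations mentioned in a document are spatially autocorrelated, use the coordinates of the places in the context of the ambiguous name to disambiguate it
  • Input:

    • All place names and addresses in the context
    • A gazetteer
  • Measure:

    • The spatial distance between each candidate and the gazetteer coordinates of all places in the context
    • The correct candidate is the one that minimizes the distance to the other places in the context (Smith and Crane, 2001)
  • Output:

    • The candidate with the smallest spatial distance
    • For example, in the second example of the table above, London (Canada) is on average spatially closer to “Southwestern Ontario”, “Quebec City-Windsor Corridor”, “Toronto”, “Ontario”, “Detroit”, and “Michigan”, so the name can be resolved to London, Canada.
  • Examples:

    • Leidner (2008) computed distances for all candidate locations of an ambiguous name and chose the closest location using a spatial minimality measure
    • Unambiguous place names are especially useful, because they provide stronger evidence for resolving ambiguous ones (Li et al., 2003; Rauch et al., 2003)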
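
A minimal sketch of the spatial-distance measure: compute the great-circle (haversine) distance from each candidate to the already resolved context places and keep the candidate with the smallest mean distance. The function names and identifiers are hypothetical, the coordinates in the usage note are approximate, and mean distance is just one possible aggregation, not necessarily the one used in the cited work.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance between two points on a spherical Earth."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def closest_candidate(candidates: Dict[str, Point],
                      context_points: List[Point]) -> str:
    """Return the candidate id with the smallest mean distance to the context."""
    def mean_distance(point: Point) -> float:
        return (sum(haversine_km(point, other) for other in context_points)
                / len(context_points))

    return min(candidates, key=lambda cid: mean_distance(candidates[cid]))
```

With candidates `{"london-uk": (51.5074, -0.1278), "london-ca": (42.9849, -81.2453)}` and context points for Toronto `(43.6532, -79.3832)` and Detroit `(42.3314, -83.0458)`, the function selects `"london-ca"`, as in the example above.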

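4.6 Other Issues

  • Geocoding non-point features

    • How to geocode features that cannot be assigned point coordinates (e.g., the River Thames)
  • Geocoding historical place names

    • How to geocode historical place names
    • For example, Smith and Mann (2003) showed that geocoding success rates are much lower for historical documents than for contemporary news reports
  • Geocoding fine-grained place names

    • Geocoding at finer granularity brings further challenges
    • For example, Pasley et al. (2007) showed that street-level data exhibit more ambiguity than regional or city-scale data
  • Gazetteer issues

    • Gazetteer quality directly affects the accuracy and precision of geocoding
      • Clough (2005) showed that disambiguation with a resource of global coverage but uneven quality (such as GeoNames) performs much worse than disambiguation with country-specific place-name resources (Graham and De Sabbata, 2015; Ahlers, 2013)
    • Combining multiple geographic resources (gazetteers or ontologies) is more likely to improve disambiguation
    • Aggregating multiple geographic resources, or choosing an appropriate one, is itself an entity-alignment research topic (Smart et al., 2010)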

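5. Resolving the Geographic Scope of Text

(1) Definition

  • Andogah et al. (2012) define the geographic scope of a document as the geographic region or area that the document is about
  • Geographic scope resolution is the process of automatically assigning a geographic scope to a document

(2) Underlying assumption

  • Unlike more general documents, many documents are relevant mainly to people within a particular region
    • For example, BBC news pages about the UK are more relevant to people in the UK than to people worldwide

(3) Related classifications

  • Ding et al. (2000) computed the geographic scope of web resources (e.g., web pages) to determine the target audience (e.g., residents of a city, a country, or the world)

  • Wang et al. (2005) further divided the locations of a web resource into three categories:

    • Provider location
      • The physical location of the provider (e.g., an organization or individual) that owns the web resource; for example, identifying service providers in a given area from the yellow pages (Himmelstein, 2005)
    • Content location
      • The location that the content of the web resource is about
    • Serving location
      • The geographic extent that the web resource reaches. For example, information from Sheffield City Council about paying council tax is probably relevant only to Sheffield residents, whereas visa information on a UK government website has global reach.

(4) Methods

  • Define the document's geographic scope from its main location, or from the weighted average of all locations in the document
  • Select by ranking the locations in the document
    • For example, the most frequently occurring georeference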
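
Both strategies can be sketched over a list of already geocoded mentions. The function names are hypothetical, and the frequency-weighted centroid below is a naive average of latitudes and longitudes, which is an assumption that only behaves sensibly for places that are close together and away from the antimeridian.

```python
from collections import Counter
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def dominant_location(mentions: List[str]) -> str:
    """Scope as the most frequently mentioned place."""
    return Counter(mentions).most_common(1)[0][0]


def weighted_centroid(mentions: List[str], coords: Dict[str, Point]) -> Point:
    """Scope as the mean coordinate, weighted by mention frequency."""
    counts = Counter(mentions)
    total = sum(counts.values())
    lat = sum(coords[name][0] * n for name, n in counts.items()) / total
    lon = sum(coords[name][1] * n for name, n in counts.items()) / total
    return lat, lon
```

For the mentions `["Sheffield", "Sheffield", "Leeds"]`, the dominant location is `"Sheffield"`, while the weighted centroid falls between the two cities, one third of the way from Sheffield towards Leeds.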

(5) Examples

  • Web-a-Where, NewsStand, and STEWARD all address document-level georeferencing

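6. Implicit Location Modelling with Language Models

(1) Motivation

  • In the big-data era, the growing maturity of machine learning and the need for more general methods have produced new approaches that do not rely on resources with explicit coordinates. Rather than looking for place names explicitly contained in a text, these approaches try to learn more general ways in which text describes location.

(2) Main principle

  • Given a large collection of documents associated with coordinates (e.g., geotagged Flickr captions, Wikipedia articles, or tweets), identify the sets of words associated with particular spatial regions and build a word-frequency language model for each region

(3) Main methods

  • Ahern et al. (2007) used k-means clustering and tf-idf weighting to select important keywords from Flickr tags, then assigned them to latitude/longitude grids of varying cell size, forming the basis of a browsing system. The system supports geographic exploration of geographically specific concepts (identified by the language model) and related photos.
  • Other early applications georeferenced Flickr images that had no coordinates. The main idea is to match a photo's tags to the most similar language model, and hence to a geographic location, using a naive Bayes learner.
  • Crandall et al. (2009) used spatial subdivision based on mean-shift clustering, with Bayesian and support vector machine learners, on both text tags and SIFT visual features.
  • O’Hare and Murdock (2013) tackled the same problem of geocoding Flickr photos with a Bayesian approach, using only textual evidence and language models associated with regular grid cells at several scales. When assigning a photo to a cell, they also used the language models of neighbouring grid cells.
  • Kinsella et al. (2011) used language modelling to locate tweets and their users, matching tweets to locations with K-L divergence and Bayesian probability. They found that at the finest granularity levels the language models clearly outperformed gazetteer-based geocoding, apparently because users are relatively likely to include local place names in tweets.
  • Wing and Baldridge (2011) applied language models to Wikipedia. They estimated a language model for each 1-degree grid cell from the geotagged documents falling in that cell, and compared documents with cells by computing the K-L divergence between the document language model and the cell language model. They also studied Bayesian approaches to modelling cell language models.
  • Roller et al. (2012) developed the method further, using clustering to produce grid cells of varying size and modifying the final matching step so that a cell's location is the centroid of the coordinates of the training Wikipedia articles in it rather than the cell's geometric centre.
  • Van Laere et al. (2014) used k-medoids clustering on Flickr, Twitter, and Wikipedia to build richer language models. Filtering the geocoded resources in the training data significantly improved the geocoding of Wikipedia articles. In addition, point-like locations (e.g., buildings) were geocoded more accurately than spatially extended objects (e.g., rivers).
  • In a recent review, Melo and Martins (2017) reported results for a range of methods: on English Twitter and Wikipedia datasets, median errors ranged from 2.2 to 640 km and mean errors from 83 to 2854 km.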
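
The grid-cell language-model idea can be sketched as follows: pool the words of geotagged training documents by grid cell, then place a new document in the cell whose smoothed word distribution has the smallest K-L divergence from the document's own. This is a minimal sketch in the spirit of the methods above, not a reimplementation of any of them; the function names, the add-one smoothing, and the toy training data in the usage note are assumptions.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

Cell = Tuple[int, int]


def cell_of(lat: float, lon: float, size: float = 1.0) -> Cell:
    """Index of the size-degree grid cell containing the point."""
    return (math.floor(lat / size), math.floor(lon / size))


def build_cell_models(docs: Iterable[Tuple[float, float, List[str]]],
                      size: float = 1.0) -> Dict[Cell, Counter]:
    """Pool word counts of geotagged documents by grid cell."""
    models: Dict[Cell, Counter] = {}
    for lat, lon, words in docs:
        models.setdefault(cell_of(lat, lon, size), Counter()).update(words)
    return models


def kl_divergence(doc: Counter, cell: Counter, vocab: Set[str]) -> float:
    """KL(doc || cell) over vocab, with add-one smoothing on both sides."""
    doc_total = sum(doc.values()) + len(vocab)
    cell_total = sum(cell.values()) + len(vocab)
    divergence = 0.0
    for word in vocab:
        p = (doc[word] + 1) / doc_total
        q = (cell[word] + 1) / cell_total
        divergence += p * math.log(p / q)
    return divergence


def locate(words: List[str], models: Dict[Cell, Counter]) -> Cell:
    """Return the cell whose language model is closest to the document's."""
    vocab = set(words).union(*models.values())
    doc = Counter(words)
    return min(sorted(models),
               key=lambda cell: kl_divergence(doc, models[cell], vocab))
```

Trained on one document near Portland, Maine, `(43.66, -70.26, ["lobster", "harbor", "lighthouse"])`, and one near Portland, Oregon, `(45.52, -122.68, ["roses", "bridges", "coffee"])`, the text `["lobster", "lighthouse", "tour"]` is placed in cell `(43, -71)` even though it contains no place name.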

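(4) Characteristics

  • These methods show considerable promise and have produced some very good results.
  • They seem better suited to:
    • Geocoding place names that have no specific location
    • Geocoding resources that contain few names (e.g., some microblog posts)
    • Summarizing the geographic distribution of an entire document collection

(5) Recent advances

  • DeLozier et al. (2015) proposed TopoCluster, which uses language modelling to georeference place names that are absent from the gazetteer

    • Use NER to identify words that may be place names, along with their context words (e.g., 10 words on either side)
    • Precompute, for every place name in the gazetteer, its probability or frequency of occurrence in each grid cell on the Earth's surface
    • At geocoding time:
      • If the place name exists in the gazetteer, assign it to the cell containing that entry, or to the nearest cell
      • Otherwise, use the per-cell occurrence probabilities of the context words to compute the cell most likely to be associated with the unknown name
    • Experiments showed that the method outperforms that of Speriosu and Baldridge for toponym resolution
  • A recent paper by Gritta et al. (2017) compared several state-of-the-art toponym resolution systems

    • These include TopoCluster, the Edinburgh Geoparser (Grover et al., 2010), GeoTxt (Karimzadeh et al., 2013), Yahoo! PlaceSpotter, and CLAVIN

    • They built a corpus from Wikipedia and GeoNames and georeferenced it with each system

    • Yahoo! PlaceSpotter and the Edinburgh Geoparser both outperformed TopoCluster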

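7. Trends and Challenges

  • Geoparsing
    • Long-standing challenges such as Chinese word segmentation and Chinese named entity recognition
    • Combining Chinese geospatial context with Chinese named entity recognition to improve geoparsing accuracy
  • Geocoding
    • The best way to measure spatial similarity (e.g., the geospatial semantic similarity proposed by Janowicz et al.)
    • Context-adaptive relevance ranking methods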

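8. Summary

This article discussed the two core processes of georeferencing: geoparsing and geocoding. Current trends suggest that both tasks need contextual information and gazetteer support to resolve ambiguity and give georeferences genuine spatial semantics. It also discussed related concepts such as computing a document's geographic scope and implicit location modelling.

For some of the cases cited, especially the evaluations of machine-learning methods, most of the differences seem to result from domain-specific tailoring of the methods. Possible factors include the language, its nature and richness, the gazetteer used, and the granularity required of the georeferencing. Domain expertise therefore appears to be an indispensable part of the researcher's georeferencing toolkit (Leveling, 2015).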

References:

Adams, B., G. McKenzie, and M. Gahegan. 2015. “Frankenplace: Interactive Thematic
Mapping for Ad Hoc Exploratory Search”. In: Proceedings of the 24th International
Conference on World Wide Web. WWW ’15. Florence, Italy: International World Wide
Web Conferences Steering Committee. 12–22. doi: 10.1145/2736277.2741137. url:
https://doi.org/10.1145/2736277.2741137.

Ahern, S., M. Naaman, R. Nair, and J. H. Yang. 2007. “World explorer: visualizing aggregate
data from unstructured text in geo-referenced collections”. In: Proceedings of the 7th
ACM/IEEE-CS joint conference on Digital libraries. ACM. 1–10.

Ahlers, D. 2013. “Assessment of the accuracy of geonames gazetteer data”. In: Proceedings
of the 7th Workshop on Geographic Information Retrieval. ACM. 74–81.

Ahmed, S. M. Z., C. McKnight, and C. Oppenheim. 2006. “A user-centred design and evalua-
tion of IR interfaces”. Journal of Librarianship and Information Science. 38(3): 157–172.
doi: 10.1177/0961000606063882. eprint: https://doi.org/10.1177/0961000606063882.
url: https://doi.org/10.1177/0961000606063882.

Allan, J., B. Croft, A. Moffat, and M. Sanderson. 2012. “Frontiers, Challenges, and Op-
portunities for Information Retrieval: Report from SWIRL 2012 the Second Strate-
gic Workshop on Information Retrieval in Lorne”. In: ACM SIGIR Forum. Vol. 46.
No. 1. New York, NY, USA: ACM. 2–32. doi: 10.1145/2215676.2215678. url: http:
//doi.acm.org/10.1145/2215676.2215678.

Alonso, O. and S. Mizzaro. 2009. “Can we get rid of TREC assessors? Using Mechanical Turk
for relevance assessment”. In: SIGIR 2009 Workshop on The Future of IR Evaluation.
15–16.

Aloteibi, S. and M. Sanderson. 2014. “Analyzing geographic query reformulation: An
exploratory study”. Journal of the Association for Information Science and Technology.
65(1): 13–24. doi: 10.1002/asi.22961. url: http://dx.doi.org/10.1002/asi.22961.

Amitay, E., N. Har’El, R. Sivan, and A. Soffer. 2004. “Web-a-where: Geotagging Web
Content”. In: Proceedings of the 27th Annual International ACM SIGIR Conference
on Research and Development in Information Retrieval. SIGIR ’04. Sheffield, United
Kingdom: ACM. 273–280. doi: 10.1145/1008992.1009040. url: http://doi.acm.org/10.
1145/1008992.1009040.

Andogah, G. 2011. Geographically Constrained Information Retrieval: Geographically Intel-
ligent Information Retrieval. Germany: LAP Lambert Academic Publishing.

Andogah, G., G. Bouma, and J. Nerbonne. 2012. “Every Document Has a Geographical
Scope”. Data Knowl. Eng. 81-82(Nov.): 1–20. doi: 10.1016/j.datak.2012.07.002. url:
http://dx.doi.org/10.1016/j.datak.2012.07.002.

Andrade, L. and M. J. Silva. 2006. “Relevance Ranking for Geographic IR.” In: Proceedings
of GIR’06.

Armitage, L. H. and P. G. B. Enser. 1997. “Analysis of user need in image archives”.
Journal of Information Science. 23(4): 287–299. doi: 10.1177/016555159702300403.
eprint: https://doi.org/10.1177/016555159702300403. url: https://doi.org/10.1177/
016555159702300403.

Axelrod, A. E. 2003. “On Building a High Performance Gazetteer Database”. In: Proceedings
of the HLT-NAACL 2003 Workshop on Analysis of Geographic References - Volume 1.
HLT-NAACL-GEOREF ’03. Stroudsburg, PA, USA: Association for Computational
Linguistics. 63–68. doi: 10.3115/1119394.1119404. url: http://dx.doi.org/10.3115/
1119394.1119404.

Baeza-Yates, R. A. and B. A. Ribeiro-Neto. 2011. Modern Information Retrieval - the
concepts and technology behind search, Second edition. Pearson Education Ltd., Harlow,
England. url: http://www.mir2ed.org/.

Bailey, P., P. Thomas, N. Craswell, A. P. D. Vries, I. Soboroff, and E. Yilmaz. 2008.
“Relevance assessment: are judges exchangeable and does it matter”. In: Proceedings of
the 31st annual international ACM SIGIR conference on Research and development in
information retrieval. ACM. 667–674.

Bird, S., E. Klein, and E. Loper. 2009. Natural Language Processing with Python. 1st.
O’Reilly Media, Inc.

Borges, K. A. V., A. H. F. Laender, C. B. Medeiros, and A. S. D. Silva. 2003. “The web
as a data source for spatial databases”. In: In Anais do V Brazilian Symposium on
Geoinformatics, Campos do Jordão.

Borlund, P. 2009. “User-Centred Evaluation of Information Retrieval Systems”. In: Infor-
mation Retrieval: Searching in the 21st Century. Ed. by A. Göker and D. J. John Wiley
& Sons. 21–37.

Borlund, P. and P. Ingwersen. 1997. “The development of a method for the evaluation of
interactive information retrieval systems”. Journal of documentation. 53(3): 225–250.

Brisaboa, N. R., M. R. Luaces, Á. S. Places, and D. Seco. 2010. “Exploiting geographic
references of documents in a geographical information retrieval system using an ontology-
based index”. GeoInformatica. 14(3): 307–331.

Brooke, J. 1996. “SUS: a quick and dirty usability scale”. In: Usability evaluation in industry.
Ed. by P. Jordan, B. Thomas, B. Weerdmeester, and I. McClelland.

Brown, T., J. Baldridge, M. Esteva, and W. Xu. 2012. “The substantial words are in
the ground and sea: computationally linking text and geography”. Texas Studies in
Literature and Language. 54(3): 324–339.

Bucher, B., P. Clough, H. Joho, R. Purves, and A. K. Syed. 2005. “Geographic IR sys-
tems: requirements and evaluation”. Proceedings of the 22nd International Cartographic
Conference. 201(2005): 11–16.

Bugayevskiy, L. and J. Snyder. 1995. Map Projections: A Working Manual. CRC Press.

Buscaldi, D. 2011. “Approaches to Disambiguating Toponyms”. SIGSPATIAL Special. 3(2):
16–19. doi: 10.1145/2047296.2047300. url: http://doi.acm.org/10.1145/2047296.
2047300.

Cai, G. 2002. “GeoVSM: An Integrated Retrieval Model for Geographic Information”.
In: Geographic Information Science: Second International Conference, GIScience 2002.
Springer. 70–85.

Cai, G. 2011. “Relevance Ranking in Geographical Information Retrieval”. SIGSPATIAL
Special. 3(2): 33–36. doi: 10.1145/2047296.2047304. url: http://doi.acm.org/10.1145/
2047296.2047304.

Carbonell, J. and J. Goldstein. 1998. “The Use of MMR, Diversity-based Reranking for
Reordering Documents and Producing Summaries”. In: Proceedings of the 21st Annual
International ACM SIGIR Conference on Research and Development in Information
Retrieval. SIGIR ’98. Melbourne, Australia: ACM. 335–336. doi: 10.1145/290941.291025.
url: http://doi.acm.org/10.1145/290941.291025.

Cardoso, N. 2011. “Evaluating Geographic Information Retrieval”. SIGSPATIAL Special.
3(2): 46–53.

Carvalho, V. R., M. Lease, and E. Yilmaz. 2011. “Crowdsourcing for search evaluation”.
SIGIR Forum. 44: 17–22.

Case, D. O. and L. M. Given. 2016. Looking for Information: A Survey of Research on
Information Seeking, Needs, and Behavior. Studies in Information. Emerald Group
Publishing Limited.

Chapelle, O., D. Metlzer, Y. Zhang, and P. Grinspan. 2009. “Expected Reciprocal Rank
for Graded Relevance”. In: Proceedings of the 18th ACM Conference on Information
and Knowledge Management. CIKM ’09. Hong Kong, China: ACM. 621–630. doi:
10.1145/1645953.1646033. url: http://doi.acm.org/10.1145/1645953.1646033.

Chen, L., G. Cong, C. S. Jensen, and D. Wu. 2013. “Spatial keyword query processing:
an experimental evaluation”. In: Proceedings of the 39th international conference on
Very Large Data Bases. PVLDB’13. Trento, Italy: VLDB Endowment. 217–228. url:
http://dl.acm.org/citation.cfm?id=2448948.2448955.

Chen, Y., T. Suel, and A. Markowetz. 2006. “Efficient query processing in geographic web
search engines”. In: SIGMOD Conference. 277–288.

Chin, J. P., V. A. Diehl, and K. L. Norman. 1988. “Development of an Instrument Measuring
User Satisfaction of the Human-computer Interface”. In: Proceedings of the SIGCHI
Conference on Human Factors in Computing Systems. CHI ’88. Washington, D.C., USA:
ACM. 213–218. doi: 10.1145/57167.57203. url: http://doi.acm.org/10.1145/57167.
57203.

Choi, J., C. Hauff, O. V. L. Olivier, and B. Thomee. 2015. “The placing task at mediaeval
2015”. In: MediaEval 2015, Wurzen, Germany, 14-15 September 2015; Ceur Workshop
Proceedings 1436, 2015. CEUR.

Christoforaki, M., J. He, C. Dimopoulos, A. Markowetz, and T. Suel. 2011. “Text vs. Space:
Efficient Geo-search Query Processing”. In: Proceedings of the 20th ACM International
Conference on Information and Knowledge Management. CIKM ’11. Glasgow, Scotland,
UK: ACM. 423–432. doi: 10.1145/2063576.2063641. url: http://doi.acm.org/10.1145/
2063576.2063641.

Cleverdon, C. W. 1991. “The significance of the Cranfield tests on index languages”. In:
Proceedings of the 14th annual international ACM SIGIR conference on Research and
development in information retrieval. SIGIR ’91. 3–12.

Cleverdon, C. W., J. Mills, and M. Keen. 1966. “Factors determining the performance of
indexing systems”. Aslib Cranfield Research Project Cranfield England.

Clough, P. 2005. “Extracting Metadata for Spatially-aware Information Retrieval on the
Internet”. In: Proceedings of the 2005 Workshop on Geographic Information Retrieval.
GIR ’05. Bremen, Germany: ACM. 25–30. doi: 10.1145/1096985.1096992. url: http:
//doi.acm.org/10.1145/1096985.1096992.

Clough, P. D., H. Joho, and R. Purves. 2006. “Judging the Spatial Relevance of Documents
for GIR”. In: Advances in Information Retrieval: 28th European Conference on IR
Research, ECIR 2006, London, UK, April 10-12, 2006. Proceedings. Ed. by M. Lalmas,
A. MacFarlane, S. Rüger, A. Tombros, T. Tsikrika, and A. Yavlinsky. Berlin, Heidelberg:
Springer Berlin Heidelberg. 548–552. doi: 10.1007/11735106_62. url: https://doi.org/
10.1007/11735106_62.

Clough, P. and M. Sanderson. 2013. “Evaluating the performance of information retrieval
systems using test collections.” Information Research. 18(2).

Cohn, A. G. and N. M. Gotts. 1996. “The ’Egg-Yolk’ Representation Of Regions with Inde-
terminate Boundaries”. In: Proceedings, GISDATA Specialist Meeting on Geographical
Objects with Undetermined Boundaries. Francis Taylor. 171–187.

Cole, C. 2011. “A Theory of Information Need for Information Retrieval That Connects
Information to Knowledge”. J. Am. Soc. Inf. Sci. Technol. 62(7): 1216–1231. doi:
10.1002/asi.21541. url: http://dx.doi.org/10.1002/asi.21541.

Cong, G. and C. S. Jensen. 2016. “Querying Geo-Textual Data: Spatial Keyword Queries
and Beyond”. In: Proceedings of the 2016 International Conference on Management of
Data. SIGMOD ’16. San Francisco, California, USA: ACM. 2207–2212. doi: 10.1145/
2882903.2912572. url: http://doi.acm.org/10.1145/2882903.2912572.

Cong, G., C. S. Jensen, and D. Wu. 2009. “Efficient Retrieval of the Top-k Most Relevant
Spatial Web Objects”. Proc. VLDB Endow. 2(1): 337–348. doi: 10.14778/1687627.1687666.
url: http://dx.doi.org/10.14778/1687627.1687666.

Coventry, K. R. and S. C. Garrod. 2004. Saying, Seeing and Acting: The Psychological
Semantics of Spatial Prepositions. Essays in Cognitive Psychology. Taylor & Francis.

Crandall, D., L. Backstrom, D. Huttenlocher, and J. Kleinberg. 2009. “Mapping the world’s
photos”. In: Proceedings of the 18th International Conference on World Wide Web. ACM.
761–770.

Curran, J. R. and S. Clark. 2003. “Language Independent NER Using a Maximum Entropy
Tagger”. In: Proceedings of the Seventh Conference on Natural Language Learning
at HLT-NAACL 2003 - Volume 4. CONLL ’03. Edmonton, Canada: Association for
Computational Linguistics. 164–167. doi: 10.3115/1119176.1119200. url: http://dx.doi.
org/10.3115/1119176.1119200.

De Felipe, I., V. Hristidis, and N. Rishe. 2008. “Keyword Search on Spatial Databases”.
In: Proceedings of the 2008 IEEE 24th International Conference on Data Engineering.
ICDE ’08. Washington, DC, USA: IEEE Computer Society. 656–665. doi: 10.1109/
ICDE.2008.4497474. url: http://dx.doi.org/10.1109/ICDE.2008.4497474.

De Sabbata, S., O. Alonso, and S. Mizzaro. 2012. “Classical vs. crowdsourcing surveys for
eliciting geographic relevance criteria”. In: IIR 2012 Italian Information Retrieval Work-
shop. Ed. by G. Amati, C. Carpineto, and G. Semeraro. CEUR Workshop Proceedings.
No. 835. Dipartimento di Informatica (DIB), Università di Bari “Aldo Moro”. 65–72.
url: https://doi.org/10.5167/uzh-66808.

De Sabbata, S. and T. Reichenbacher. 2010. “A Probabilistic Model of Geographic Rel-
evance”. In: Proceedings of the 6th Workshop on Geographic Information Retrieval.
GIR ’10. Zurich, Switzerland: ACM. 23:1–23:2. doi: 10.1145/1722080.1722109. url:
http://doi.acm.org/10.1145/1722080.1722109.

De Sabbata, S. and T. Reichenbacher. 2012. “Criteria of geographic relevance: an experi-
mental study”. International Journal of Geographical Information Science. 26(8): 1495–
1520.

DeLozier, G., J. Baldridge, and L. London. 2015. “Gazetteer-independent Toponym Reso-
lution Using Geographic Word Profiles”. In: Proceedings of the Twenty-Ninth AAAI
Conference on Artificial Intelligence. AAAI’15. Austin, Texas: AAAI Press. 2382–2388.
url: http://dl.acm.org/citation.cfm?id=2886521.2886652.

Derungs, C. and R. S. Purves. 2014. “From text to landscape: locating, identifying and
mapping the use of landscape features in a Swiss Alpine corpus”. International Journal of
Geographical Information Science. 28(6): 1272–1293. doi: 10.1080/13658816.2013.772184.
eprint: http://dx.doi.org/10.1080/13658816.2013.772184. url: http://dx.doi.org/10.
1080/13658816.2013.772184.

Derungs, C. and R. S. Purves. 2016. “Mining nearness relations from an n-grams Web
corpus in geographical space”. Spatial Cognition & Computation. 16(4): 301–322.

Ding, J., L. Gravano, and N. Shivakumar. 2000. “Computing Geographical Scopes of Web
Resources”. In: Proceedings of the 26th International Conference on Very Large Data
Bases. VLDB ’00. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. 545–556.
url: http://dl.acm.org/citation.cfm?id=645926.672013.

Drosou, M. and E. Pitoura. 2010. “Search Result Diversification”. SIGMOD Rec. 39(1): 41–1 doi: 10.1145/1860702.1860709. url: http://doi.acm.org/10.1145/1860702.1860709.

Dunlop, M. 2000. “Reflections on Mira: Interactive evaluation in information retrieval”.
Journal of the American Society for Information Science. 51(14): 1269–1274. doi:
10.1002/1097-4571(2000)9999:9999<::AID-ASI1042>3.0.CO;2-7.

Dykes, J., A. M. MacEachren, and M.-J. Kraak. 2005. Exploring geovisualization. Elsevier.

Feng, J., M. Johnston, and S. Bangalore. 2011. “Speech and multimodal interaction in
mobile search”. Signal Processing Magazine, IEEE. 28(4): 40–49.

Ferrès, D. and H. Rodriguez. 2015. “Evaluating geographical knowledge re-ranking, linguistic
processing and query expansion techniques for geographical information retrieval”. In:
International Symposium on String Processing and Information Retrieval. Springer.
311–323.

Fisher, P. 2000. “Sorites paradox and vague geographies”. Fuzzy Sets and Systems. 113(1):
7–18. doi: https://doi.org/10.1016/S0165-0114(99)00009-3.
url: http://www.sciencedirect.com/science/article/pii/S0165011499000093.

Frontiera, P., R. Larson, and J. Radke. 2008. “A Comparison of Geometric Approaches
to Assessing Spatial Similarity for GIR”. Int. J. Geogr. Inf. Sci. 22(3): 337–360. doi:
10.1080/13658810701626293. url: http://dx.doi.org/10.1080/13658810701626293.

Gaio, M., C. Sallaberry, P. Etcheverry, C. Marquesuzaa, and J. Lesbegueries. 2008. “A
global process to access documents’ contents from a geographical point of view”. Journal
of Visual Languages & Computing. 19(1): 3–23.

Gale, W., K. Church, and D. Yarowsky. 1992. “One Sense Per Discourse”. In: Proceedings
of the Workshop on Speech and Natural Language. HLT ’91. Harriman, New York:
Association for Computational Linguistics. 233–237. doi: 10.3115/1075527.1075579.
url: http://dx.doi.org/10.3115/1075527.1075579.

Gan, Q., J. Attenberg, A. Markowetz, and T. Suel. 2008. “Analysis of Geographic Queries in
a Search Engine Log”. In: Proceedings of the First International Workshop on Location
and the Web. LOCWEB ’08. Beijing, China: ACM. 49–56. doi: 10.1145/1367798.1367806.
url: http://doi.acm.org/10.1145/1367798.1367806.

Gao, S., K. Janowicz, D. R. Montello, Y. Hu, J.-A. Yang, G. McKenzie, Y. Ju, L. Gong, B.
Adams, and B. Yan. 2017. “A data-synthesis-driven method for detecting and extracting
vague cognitive regions”. International Journal of Geographical Information Science.
31(6): 1245–1271.

Gey, F. C., R. R. Larson, M. Sanderson, H. Joho, P. Clough, and V. Petras. 2005. “GeoCLEF:
The CLEF 2005 Cross-Language Geographic Information Retrieval Track Overview”.
In: CLEF. 908–919.

Gey, F., R. Larson, M. Sanderson, K. Bischoff, T. Mandl, C. Womser-Hacker, D. Santos,
P. Rocha, and A. Montoyo. 2007. “Challenges to evaluation of multilingual geographic
information retrieval in GeoCLEF”. In: The First International Workshop on Evaluating
Information Access (EVIA). url: http://eprints.whiterose.ac.uk/4535/.

Goodchild, M. F. 2010. “Twenty years of progress: GIScience in 2010”. Journal of Spatial
Information Science. 2010(1): 3–20.

Graham, M. and S. De Sabbata. 2015. “Mapping information wealth and poverty: the
geography of gazetteers”. Environment and Planning A. 47(6): 1254–1264.

Gritta, M., M. T. Pilehvar, N. Limsopatham, and N. Collier. 2017. “What’s missing in
geographical parsing?” Language Resources and Evaluation. Mar. doi: 10.1007/s10579-
017-9385-8. url: https://doi.org/10.1007/s10579-017-9385-8.

Grover, C., R. Tobin, K. Byrne, M. Woollard, J. Reid, S. Dunn, and J. Ball. 2010. “Use
of the Edinburgh geoparser for georeferencing digitized historical collections”. Philo-
sophical Transactions of the Royal Society of London A: Mathematical, Physical and
Engineering Sciences. 368(1925): 3875–3889. doi: 10.1098/rsta.2010.0149. eprint:
http://rsta.royalsocietypublishing.org/content/368/1925/3875.full.pdf. url: http:
//rsta.royalsocietypublishing.org/content/368/1925/3875.

Hall, M. M., P. D. Smart, and C. B. Jones. 2011. “Interpreting Spatial Language in Image
Captions”. Cognitive Processing. 12(1): 67–94. doi: 10.1007/s10339-010-0385-5.

Hariharan, R., B. Hore, C. Li, and S. Mehrotra. 2007a. “Processing Spatial-Keyword (SK)
Queries in Geographic Information Retrieval (GIR) Systems”. In: 19th International
Conference on Scientific and Statistical Database Management (SSDBM 2007). 16–16.
doi: 10.1109/SSDBM.2007.22.

Hariharan, R., B. Hore, C. Li, and S. Mehrotra. 2007b. “Processing Spatial-Keyword (SK)
Queries in Geographic Information Retrieval (GIR) Systems”. In: Proceedings of the 19th
International Conference on Scientific and Statistical Database Management. SSDBM ’07.
Washington, DC, USA: IEEE Computer Society. 16–16. doi: 10.1109/SSDBM.2007.22.
url: http://dx.doi.org/10.1109/SSDBM.2007.22.

Harman, D. 2011. Information Retrieval Evaluation. 1st. Morgan & Claypool Publishers.

Harter, S. P. and C. A. Hert. 1997. “Evaluation of information retrieval systems: Approaches,
issues, and methods.” Annual Review of Information Science and Technology (ARIST).
32: 3–94.

Hearst, M. A. 2009. Search User Interfaces. 1st. New York, NY, USA: Cambridge University
Press.

Henrich, A. and V. Luedecke. 2007. “Characteristics of Geographic Information Needs”.
In: Proceedings of the 4th ACM Workshop on Geographical Information Retrieval.
GIR ’07. Lisbon, Portugal: ACM. 1–6. doi: 10.1145/1316948.1316950.
url: http://doi.acm.org/10.1145/1316948.1316950.

Herskovits, A. 1986. Language and Spatial Cognition: An Interdisciplinary Study of Prepo-
sitions in English. Cambridge University Press.

Hill, L. L. 2000. “Core elements of digital gazetteers: placenames, categories, and footprints”.
In: Research and advanced technology for digital libraries. Springer. 280–290.

Hill, L. L. 2006. Georeferencing: The Geographic Associations of Information (Digital
Libraries and Electronic Publishing). The MIT Press.

Hill, L. L., L. Carver, M. Larsgaard, R. Dolin, T. R. Smith, J. Frew, and M.-A. Rae. 2000.
“Alexandria digital library: user evaluation studies and system design”. Journal of the
American Society for Information Science. 51(3): 246–259.

Himmelstein, M. 2005. “Local Search: The Internet Is the Yellow Pages”. Computer. 38(2):
26–34. doi: http://doi.ieeecomputersociety.org/10.1109/MC.2005.65.

Hjaltason, G. and H. Samet. 1999. “Distance browsing in spatial databases”. ACM Trans-
actions on Database Systems. 24(2): 265–318.

Hobona, G., P. James, and D. Fairbairn. 2005. “An Evaluation of a Multidimensional Visual
Interface for Geographic Information Retrieval”. In: Proceedings of the 2005 Workshop
on Geographic Information Retrieval. GIR ’05. Bremen, Germany: ACM. 5–8. doi:
10.1145/1096985.1096988. url: http://doi.acm.org/10.1145/1096985.1096988.

Hoeber, O. and X. D. Yang. 2007. “User-Oriented Evaluation Methods for Interactive
Web Search Interfaces”. In: Proceedings of the 2007 IEEE/WIC/ACM International
Conferences on Web Intelligence and Intelligent Agent Technology - Workshops. WI-
IATW ’07. IEEE Computer Society. 239–243.

Hofmann, K., L. Li, and F. Radlinski. 2016. “Online Evaluation for Information Retrieval”.
Foundations and Trends® in Information Retrieval. 10(June): 1–117.

Ide, N. and J. Véronis. 1998. “Introduction to the Special Issue on Word Sense Disambiguation:
The State of the Art”. Comput. Linguist. 24(1): 2–40. url: http://dl.acm.org/
citation.cfm?id=972719.972721.

Janowicz, K., M. Raubal, and W. Kuhn. 2011. “The semantics of similarity in geographic
information retrieval”. Journal of Spatial Information Science. 2011(2): 29–57.

Järvelin, K. 2011. “Evaluation”. In: Interactive information seeking, behaviour and retrieval.
Ed. by I. Ruthven and D. Kelly. London, UK: Facet Publishing.

Järvelin, K. and J. Kekäläinen. 2000. “IR Evaluation Methods for Retrieving Highly Relevant
Documents”. In: Proceedings of the 23rd Annual International ACM SIGIR Conference
on Research and Development in Information Retrieval. SIGIR ’00. Athens, Greece: ACM.
41–48. doi: 10.1145/345508.345545. url: http://doi.acm.org/10.1145/345508.345545.

Jones, C. B., A. I. Abdelmoty, D. Finch, G. Fu, and S. Vaid. 2004. “The SPIRIT Spatial
Search Engine: Architecture, Ontologies and Spatial Indexing”. In: Geographic Informa-
tion Science. Ed. by M. J. Egenhofer, C. Freksa, and H. J. Miller. Vol. 3234. Lecture
Notes in Computer Science. Springer Berlin Heidelberg. 125–139. doi: 10.1007/978-3-
540-30231-5_9. url: http://dx.doi.org/10.1007/978-3-540-30231-5_9.

Jones, C. B. and R. S. Purves. 2008. “Geographical Information Retrieval”. Int. J. Geogr.
Inf. Sci. 22(3): 219–228. doi: 10.1080/13658810701626343. url: http://dx.doi.org/10.
1080/13658810701626343.

Jones, C. B., R. S. Purves, P. D. Clough, and H. Joho. 2008. “Modelling Vague Places with
Knowledge from the Web”. Int. J. Geogr. Inf. Sci. 22(10): 1045–1065. doi:
10.1080/13658810701850547. url: http://dx.doi.org/10.1080/13658810701850547.

Karimzadeh, M., W. Huang, S. Banerjee, J. O. Wallgrün, F. Hardisty, S. Pezanowski,
P. Mitra, and A. M. MacEachren. 2013. “GeoTxt: A Web API to Leverage Place
References in Text”. In: Proceedings of the 7th Workshop on Geographic Information
Retrieval. GIR ’13. Orlando, Florida: ACM. 72–73. doi: 10.1145/2533888.2533942. url:
http://doi.acm.org/10.1145/2533888.2533942.

Karney, C. F. F. 2013. “Algorithms for Geodesics”. Journal of Geodesy. 87(1): 43–55.

Kelly, D. 2009. “Methods for evaluating interactive information retrieval systems with
users”. Foundations and Trends in Information Retrieval. 3(1-2): 1–224.

Keßler, C., K. Janowicz, and M. Bishr. 2009. “An agenda for the next generation gazetteer:
Geographic information contribution and retrieval”. In: Proceedings of the 17th ACM
SIGSPATIAL international conference on advances in Geographic Information Systems.
ACM. 91–100.

Khodaei, A., C. Shahabi, and C. Li. 2010. “Hybrid Indexing and Seamless Ranking of
Spatial and Textual Features of Web Documents”. In: DEXA. 450–466.

Khodaei, A., C. Shahabi, and C. Li. 2012. “SKIF-P: a point-based indexing and ranking of
web documents for spatial-keyword search”. GeoInformatica. 16(3): 563–596.

Kinney, K. A., S. B. Huffman, and J. Zhai. 2008. “How evaluator domain expertise affects
search result relevance judgments”. In: Proceedings of the 17th ACM conference on
Information and knowledge management. CIKM ’08. ACM. 591–598.

Kinsella, S., V. Murdock, and N. O’Hare. 2011. ““I’M Eating a Sandwich in Glasgow”:
Modeling Locations with Tweets”. In: Proceedings of the 3rd International Workshop
on Search and Mining User-generated Contents. SMUC ’11. Glasgow, Scotland, UK:
ACM. 61–68. doi: 10.1145/2065023.2065039. url: http://doi.acm.org/10.1145/2065023.
2065039.

Kreveld, M., I. Reinbacher, A. Arampatzis, and R. Zwol. 2005. “Distributed Ranking Methods
for Geographic Information Retrieval”. In: Developments in Spatial Data Handling: 11th
International Symposium on Spatial Data Handling. Berlin, Heidelberg: Springer Berlin
Heidelberg. 231–243. doi: 10.1007/3-540-26772-7_18. url:
http://dx.doi.org/10.1007/3-540-26772-7_18.

Laere, O. V., S. Schockaert, V. Tanasescu, B. Dhoedt, and C. B. Jones. 2014. “Georeferencing
Wikipedia Documents Using Data from Social Media Sources”. ACM Trans. Inf. Syst.
32(3): 12:1–12:32. doi: 10.1145/2629685. url: http://doi.acm.org/10.1145/2629685.

Landau, B. and R. Jackendoff. 1993. ““What” and “where” in spatial language and spatial
cognition”. Behavioral and Brain Sciences. 16(2): 217–238.

Larson, R. R. 1996. “Geographic Information Retrieval and Spatial Browsing”. GIS and
Libraries: Patrons, Maps and Spatial Information. Apr.: 81–124. Ed. by L. Smith and
M. Gluck.

Larson, R. R. 2011. “Ranking Approaches for GIR”. SIGSPATIAL Special. 3(2): 37–41.
doi: 10.1145/2047296.2047305. url: http://doi.acm.org/10.1145/2047296.2047305.

Larson, R. R. and P. Frontiera. 2004. “Spatial Ranking Methods for Geographic Information
Retrieval (GIR) in Digital Libraries”. In: Research and Advanced Technology for Digital
Libraries, 8th European Conference, ECDL 2004, Bath, UK, September 12-17, 2004,
Proceedings. 45–56.

Leidner, J. L. 2006. “An evaluation dataset for the toponym resolution task”. Computers,
Environment and Urban Systems. 30(4): 400–417.

Leidner, J. L. 2008. Toponym Resolution in Text: Annotation, Evaluation and Applications
of Spatial Grounding of Place Names. Boca Raton, FL, USA: Universal Press.

Leidner, J. L. and M. D. Lieberman. 2011. “Detecting Geographical References in the Form
of Place Names and Associated Spatial Natural Language”. SIGSPATIAL Special. 3(2):
5–11. doi: 10.1145/2047296.2047298. url: http://doi.acm.org/10.1145/2047296.2047298.

Leveling, J. 2015. “Tagging of Temporal Expressions and Geological Features in Scientific
Articles”. In: Proceedings of the 9th Workshop on Geographic Information Retrieval.
GIR ’15. Paris, France: ACM. 6:1–6:10. doi: 10.1145/2837689.2837701. url: http:
//doi.acm.org/10.1145/2837689.2837701.

Levinson, S. C. 2003a. “Frames of reference”. In: Space in Language and Cognition: Explo-
rations in Cognitive Diversity. Language Culture and Cognition. Cambridge University
Press. 24–61. doi: 10.1017/CBO9780511613609.003.

Levinson, S. C. 2003b. Space in language and cognition: Explorations in cognitive diversity.
Cambridge: CUP.

Lewis, J. R. 1995. “IBM computer usability satisfaction questionnaires: psychometric evalu-
ation and instructions for use”. International Journal of Human-Computer Interaction.
7(1): 57–78.

Li, H. 2011. “A Short Introduction to Learning to Rank”. IEICE Transactions on Information
and Systems. E94.D(10): 1854–1862. doi: 10.1587/transinf.E94.D.1854.

Li, H., R. K. Srihari, C. Niu, and W. Li. 2003. “InfoXtract Location Normalization: A
Hybrid Approach to Geographic References in Information Extraction”. In: Proceedings
of the HLT-NAACL 2003 Workshop on Analysis of Geographic References - Volume 1.
HLT-NAACL-GEOREF ’03. Stroudsburg, PA, USA: Association for Computational
Linguistics. 39–44. doi: 10.3115/1119394.1119400. url: http://dx.doi.org/10.3115/
1119394.1119400.

Li, Z., K. C. K. Lee, B. Zheng, W.-C. Lee, D. L. Lee, and X. Wang. 2011. “IR-Tree: An
Efficient Index for Geographic Document Search.” IEEE Transactions on Knowledge
and Data Engineering. 23(4): 585–599. url: http://dblp.uni-trier.de/db/journals/tkde/
tkde23.html#LiLZLLW11.

Lieberman, M. D., H. Samet, and J. Sankaranarayanan. 2010. “Geotagging with local
lexicons to build indexes for textually-specified spatial data”. In: 2010 IEEE 26th
International Conference on Data Engineering (ICDE 2010). 201–212. doi: 10.1109/
ICDE.2010.5447903.

Lieberman, M. D., H. Samet, J. Sankaranarayanan, and J. Sperling. 2007. “STEWARD:
Architecture of a Spatio-textual Search Engine”. In: Proceedings of the 15th Annual
ACM International Symposium on Advances in Geographic Information Systems. GIS
’07. Seattle, Washington: ACM. 25:1–25:8. doi: 10.1145/1341012.1341045. url: http:
//doi.acm.org/10.1145/1341012.1341045.

Lieberman, M. and H. Samet. 2012. “Adaptive Context Features for Toponym Resolution
in Streaming News”. In: Proceedings of the 35th International ACM SIGIR Conference
on Research and Development in Information Retrieval. SIGIR ’12. Portland, Oregon,
USA: ACM. 731–740. isbn: 978-1-4503-1472-5. doi: 10.1145/2348283.2348381. url:
http://doi.acm.org/10.1145/2348283.2348381.

Liu, T.-Y. 2009. “Learning to Rank for Information Retrieval”. Foundations and Trends in
Information Retrieval. 3(3): 225–331. doi: 10.1561/1500000016. url: http://dx.doi.org/
10.1561/1500000016.

Longley, P. A., M. F. Goodchild, D. J. Maguire, and D. W. Rhind. 2015. Geographic
information science and systems. John Wiley & Sons.

Lowe, D. G. 2004. “Distinctive image features from scale-invariant keypoints”. International
journal of computer vision. 60(2): 91–110.

MacEachren, A. M. 1995. How maps work: representation, visualization, and design. Guilford
Press.

Mackaness, W. A., A. Ruas, and L. T. Sarjakoski. 2011. Generalisation of geographic
information: cartographic modelling and applications. Elsevier.

Mandl, T. 2011. “Evaluating GIR: Geography-oriented or User-oriented?” SIGSPATIAL
Special. 3(2): 42–45.

Mandl, T., P. Carvalho, G. M. Di Nunzio, F. Gey, R. R. Larson, D. Santos, and C.
Womser-Hacker. 2008a. “GeoCLEF 2008: the CLEF 2008 cross-language geographic
information retrieval track overview”. In: Evaluating Systems for Multilingual and
Multimodal Information Access. Springer. 808–821.

Mandl, T., F. Gey, G. Di Nunzio, N. Ferro, M. Sanderson, D. Santos, and C. Womser-
Hacker. 2008b. “An Evaluation Resource for Geographic Information Retrieval”. In:
Proceedings of the Sixth International Conference on Language Resources and Evaluation
(LREC’08). Ed. by N. Calzolari (Chair), K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S.
Piperidis, and D. Tapias. http://www.lrec-conf.org/proceedings/lrec2008/. Marrakech,
Morocco: European Language Resources Association (ELRA).

Manguinhas, H., B. Martins, and J. Borbinha. 2008. “A geo-temporal web gazetteer
integrating data from multiple sources”. In: Digital Information Management, 2008.
ICDIM 2008. Third International Conference on. IEEE. 146–153.

Manning, C. D., P. Raghavan, and H. Schütze. 2008. Introduction to Information Retrieval.
New York, NY, USA: Cambridge University Press.

Mao, J., Y. Liu, K. Zhou, J.-Y. Nie, J. Song, M. Zhang, S. Ma, J. Sun, and H. Luo. 2016.
“When Does Relevance Mean Usefulness and User Satisfaction in Web Search?”
In: Proceedings of the 39th International ACM SIGIR Conference on Research and
Development in Information Retrieval. SIGIR ’16. Pisa, Italy: ACM. 463–472. doi:
10.1145/2911451.2911507. url: http://doi.acm.org/10.1145/2911451.2911507.

Marchionini, G. 2006. “Exploratory Search: From Finding to Understanding”. Commun.
ACM. 49(4): 41–46. doi: 10.1145/1121949.1121979. url: http://doi.acm.org/10.1145/
1121949.1121979.

Mark, D. M., D. Comas, M. J. Egenhofer, S. M. Freundschuh, M. D. Gould, and J. Nunes. 1995.
“Evaluating and refining computational models of spatial relations through
cross-linguistic human-subjects testing”. In: Spatial Information Theory A Theoretical Basis
for GIS. Vol. 988/1995. Lecture Notes in Computer Science. Springer Berlin / Heidelberg.
553–568.

Martins, B. 2011. “A supervised machine learning approach for duplicate detection over
gazetteer records”. In: GeoSpatial Semantics. Springer. 34–51.

Martins, B. and P. Calado. 2010. “Learning to Rank for Geographic Information Retrieval”.
In: Proceedings of the 6th Workshop on Geographic Information Retrieval. GIR ’10.
Zurich, Switzerland: ACM. 21:1–21:8. doi: 10.1145/1722080.1722107. url:
http://doi.acm.org/10.1145/1722080.1722107.

Martins, B., M. J. Silva, and M. S. Chaves. 2005. “Challenges and Resources for Evaluating
Geographical IR”. In: Proceedings of the 2005 Workshop on Geographic Information
Retrieval. GIR ’05. Bremen, Germany: ACM. 65–69.

Maskari, A., M. Sanderson, and P. Clough. 2007. “The Relationship Between IR Effective-
ness Measures and User Satisfaction”. In: Proceedings of the 30th Annual International
ACM SIGIR Conference on Research and Development in Information Retrieval. SI-
GIR ’07. Amsterdam, The Netherlands: ACM. 773–774. isbn: 978-1-59593-597-7. doi:
10.1145/1277741.1277902. url: http://doi.acm.org/10.1145/1277741.1277902.

McCurley, K. S. 2001. “Geospatial Mapping and Navigation of the Web”. In: Proceedings of
the 10th International Conference on World Wide Web. WWW ’01. Hong Kong, Hong
Kong: ACM. 221–229. doi: 10.1145/371920.372056.

Melo, F. and B. Martins. 2017. “Automated Geocoding of Textual Documents: A Survey of
Current Approaches”. Transactions in GIS. 21(1): 3–38.

Mikheev, A., M. Moens, and C. Grover. 1999. “Named Entity Recognition Without
Gazetteers”. In: Proceedings of the Ninth Conference on European Chapter of the
Association for Computational Linguistics. EACL ’99. Bergen, Norway: Association for
Computational Linguistics. 1–8. doi: 10.3115/977035.977037. url: http://dx.doi.org/
10.3115/977035.977037.

Montello, D. R. 2016. “Cognition and Spatial Behavior”. In: International Encyclopedia of
Geography: People, the Earth, Environment and Technology. John Wiley & Sons, Ltd.
doi: 10.1002/9781118786352.wbieg0498. url: http://dx.doi.org/10.1002/9781118786352.
wbieg0498.

Montello, D. R., M. F. Goodchild, J. Gottsegen, and P. Fohl. 2003. “Where’s Downtown?:
Behavioral Methods for Determining Referents of Vague Spatial Queries”. Spatial
Cognition & Computation. 3(3): 185–204.

Morimoto, Y., M. Aono, M. E. Houle, and K. S. McCurley. 2003. “Extracting spatial
knowledge from the web”. In: 2003 Symposium on Applications and the Internet, 2003.
Proceedings. 326–333. doi: 10.1109/SAINT.2003.1183066.

Morville, P. and L. Rosenfeld. 2006. Information Architecture for the World Wide Web.
O’Reilly Media, Inc.

Nadeau, D. and S. Sekine. 2007. “A Survey of Named Entity Recognition and Classification”.
Lingvisticae Investigationes. 30(1): 1–20. url: http://nlp.cs.nyu.edu/sekine/
papers/li07.pdf.

Nielsen, J. 1993. Usability Engineering. San Francisco, CA, USA: Morgan Kaufmann
Publishers Inc.

O’Hare, N. and V. Murdock. 2013. “Modeling locations with social media”. Information
Retrieval. 16(1): 30–62.

O’Sullivan, D. and D. Unwin. 2014. Geographic information analysis. John Wiley & Sons.

Opach, T., I. Golebiowska, and S. I. Fabrikant. 2013. “How Do People View
Multi-Component Animated Maps?” The Cartographic Journal. online(Oct.). doi:
10.1179/1743277413Y.0000000049.

Overell, S. and S. Rüger. 2008. “Using co-occurrence models for placename disambiguation”.
International Journal of Geographical Information Science. 22(3): 265–287.

Palacio, D., G. Cabanac, C. Sallaberry, and G. Hubert. 2010. “On the Evaluation of
Geographic Information Retrieval Systems: Evaluation Framework and Case Study”.
Int. J. Digit. Libr. 11(2): 91–109. doi: 10.1007/s00799-011-0070-z. url: http://dx.doi.
org/10.1007/s00799-011-0070-z.

Palacio, D., C. Derungs, and R. Purves. 2015. “Development and evaluation of a geographic
information retrieval system using fine grained toponyms”. Journal of Spatial Information
Science. (11): 1–29.

Pasley, R. C., P. D. Clough, and M. Sanderson. 2007. “Geo-tagging for Imprecise Regions of
Different Sizes”. In: Proceedings of the 4th ACM Workshop on Geographical Information
Retrieval. GIR ’07. Lisbon, Portugal: ACM. 77–82. doi: 10.1145/1316948.1316969. url:
http://doi.acm.org/10.1145/1316948.1316969.

Purves, R. S. and P. D. Clough. 2006. “Judging spatial relevance and document location for
Geographic Information Retrieval”. In: Proceedings of the 4th International Conference
on Geographic Information Science (GIScience 2006). 159–164.

Purves, R. S., P. Clough, C. B. Jones, A. Arampatzis, B. Bucher, D. Finch, G. Fu, H.
Joho, A. K. Syed, S. Vaid, and B. Yang. 2007. “The Design and Implementation of
SPIRIT: A Spatially Aware Search Engine for Information Retrieval on the Internet”.
Int. J. Geogr. Inf. Sci. 21(7): 717–745. doi: 10.1080/13658810601169840. url: http:
//dx.doi.org/10.1080/13658810601169840.

Purves, R. S., A. Edwardes, and M. Sanderson. 2008. “Describing the where–improving
image annotation and search through geography”. In: Proceedings of the workshop on
Metadata Mining for Image Understanding (MMIU 2008). Sheffield.

Raper, J. 2007. “Geographic relevance”. Journal of Documentation. 63(6): 836–852.

Rapp, R. H. 1993. “Geometric geodesy, part II”. Tech. rep. Ohio State
Univ. url: http://hdl.handle.net/1811/24409.

Rauch, E., M. Bukatin, and K. Baker. 2003. “A Confidence-based Framework for Disam-
biguating Geographic Terms”. In: Proceedings of the HLT-NAACL 2003 Workshop on
Analysis of Geographic References - Volume 1. HLT-NAACL-GEOREF ’03. Stroudsburg,
PA, USA: Association for Computational Linguistics. 50–54. doi: 10.3115/1119394.1119402.
url: http://dx.doi.org/10.3115/1119394.1119402.

Recchia, G. and M. M. Louwerse. 2013. “A Comparison of String Similarity Measures for
Toponym Matching.” In: Proceedings of The First ACM SIGSPATIAL International
Workshop on Computational Models of Place. COMP ’13. Orlando FL, USA: ACM.
54:54–54:61. doi: 10.1145/2534848.2534850. url: http://doi.acm.org/10.1145/2534848.
2534850.

Reichenbacher, T., S. De Sabbata, R. S. Purves, and S. I. Fabrikant. 2016. “Assessing
geographic relevance for mobile search: A computational model and its validation via
crowdsourcing”. Journal of the Association for Information Science and Technology.
67(11): 2620–2634. doi: 10.1002/asi.23625. url: http://dx.doi.org/10.1002/asi.23625.

Robertson, S. E. 1981. “The methodology of information retrieval experiment”. In: Infor-
mation retrieval experiment. Butterworths. 9–31.

Robertson, S. E. and M. M. Hancock-Beaulieu. 1992. “On the Evaluation of IR Systems”.
Inf. Process. Manage. 28(4): 457–466. doi: 10.1016/0306-4573(92)90004-J. url:
http://dx.doi.org/10.1016/0306-4573(92)90004-J.

Robertson, S. and H. Zaragoza. 2009. “The Probabilistic Relevance Framework: BM25 and
Beyond”. Foundations and Trends in Information Retrieval. 3(4): 333–389. doi:
10.1561/1500000019. url: http://dx.doi.org/10.1561/1500000019.

Rocha-Junior, J. B., O. Gkorgkas, S. Jonassen, and K. Nørvåg. 2011. “Efficient Processing
of Top-k Spatial Keyword Queries”. In: Proceedings of the 12th International Conference
on Advances in Spatial and Temporal Databases. SSTD’11. Minneapolis, MN: Springer-
Verlag. 205–222. url: http://dl.acm.org/citation.cfm?id=2035253.2035270.

Rodden, K., H. Hutchinson, and X. Fu. 2010. “Measuring the User Experience on a Large
Scale: User-centered Metrics for Web Applications”. In: Proceedings of the SIGCHI
Conference on Human Factors in Computing Systems. CHI ’10. Atlanta, Georgia, USA:
ACM. 2395–2398. doi: 10.1145/1753326.1753687. url: http://doi.acm.org/10.1145/
1753326.1753687.

Roller, S., M. Speriosu, S. Rallapalli, B. Wing, and J. Baldridge. 2012. “Supervised text-
based geolocation using language models on an adaptive grid”. In: Proceedings of the
2012 Joint Conference on Empirical Methods in Natural Language Processing and
Computational Natural Language Learning. Association for Computational Linguistics.
1500–1510.

Russell-Rose, T. and T. Tate. 2013. Designing the Search Experience: The Information
Architecture of Discovery. 1st. San Francisco, CA, USA: Morgan Kaufmann Publishers
Inc.

Sanderson, M. 2010. “Test Collection Based Evaluation of Information Retrieval Systems”.
Foundations and Trends in Information Retrieval. 4(4): 247–375. doi: 10.1561/1500000009.
url: http://dx.doi.org/10.1561/1500000009.

Sanderson, M. and J. Kohler. 2004. “Analyzing geographic queries”. In: Proceedings of the
Workshop on Geographic Information Retrieval. Sheffield.

Santos, D., L. M. Cabral, C. Forascu, P. Forner, F. C. Gey, K. Lamm, T. Mandl, P. Osenova,
A. Peñas, Á. Rodrigo, J. M. Schulz, Y. Skalban, and E. F. T. K. Sang. 2010. “GikiCLEF:
Crosscultural Issues in Multilingual Information Access.” In: Proceedings of the Seventh
conference on International Language Resources and Evaluation (LREC’10). Ed. by
N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, M. Rosner,
and D. Tapias. European Languages Resources Association (ELRA). 2346–2353.

Santos, D., N. Cardoso, P. Carvalho, I. Dornescu, S. Hartrumpf, J. Leveling, and Y. Skalban.
“GikiP at GeoCLEF 2008: Joining GIR and QA forces for querying Wikipedia”. In:
Evaluating Systems for Multilingual and Multimodal Information Access: 9th Workshop
of the Cross-Language Evaluation Forum. Ed. by C. Peters, T. Deselaers, N. Ferro,
J. Gonzalo, G. J. F. Jones, M. Kurimo, T. Mandl, A. Peñas, and V. Petras. Vol. 5706.
Lecture Notes in Computer Science (LNCS). Springer. 894–905.

Santos, R. L. T., C. Macdonald, and I. Ounis. 2015. “Search Result Diversification”.
Foundations and Trends in Information Retrieval. 9(1): 1–90. doi: 10.1561/1500000040.
url: http://dx.doi.org/10.1561/1500000040.

Saracevic, T. 1995. “Evaluation of Evaluation in Information Retrieval”. In: Proceedings of
the 18th Annual International ACM SIGIR Conference on Research and Development
in Information Retrieval. SIGIR ’95. Seattle, Washington, USA: ACM. 138–146. doi:
10.1145/215206.215351.

Saracevic, T. 1996. “Relevance reconsidered”. In: Information science: Integration in perspec-
tives. Proceedings of the Second Conference on Conceptions of Library and Information
Science. 201–218.

Schockaert, S. 2011. “Vague Regions in Geographic Information Retrieval”. SIGSPATIAL
Special. 3(2): 24–28. doi: 10.1145/2047296.2047302. url: http://doi.acm.org/10.1145/
2047296.2047302.

Sehgal, V., L. Getoor, and P. D. Viechnicki. 2006. “Entity resolution in geospatial data
integration”. In: Proceedings of the 14th annual ACM international symposium on
Advances in geographic information systems. ACM. 83–90.

Shatford, S. 1986. “Analyzing the subject of a picture: a theoretical approach”. Cataloging
& classification quarterly. 6(3): 39–62.

Shaw, B., J. Shea, S. Sinha, and A. Hogue. 2013. “Learning to Rank for Spatiotemporal
Search”. In: Proceedings of the Sixth ACM International Conference on Web Search and
Data Mining. WSDM ’13. Rome, Italy: ACM. 717–726. doi: 10.1145/2433396.2433485.
url: http://doi.acm.org/10.1145/2433396.2433485.

Shneiderman, B. 1996. “The eyes have it: A task by data type taxonomy for information
visualizations”. In: Visual Languages, 1996. Proceedings., IEEE Symposium on. IEEE.
336–343.

Shneiderman, B., D. Byrd, and W. B. Croft. 1998. “Sorting out Searching: A User-interface
Framework for Text Searches”. Commun. ACM. 41(4): 95–98. doi: 10.1145/273035.273069.
url: http://doi.acm.org/10.1145/273035.273069.

Smart, P. D., C. B. Jones, and F. A. Twaroch. 2010. “Multi-source Toponym Data Integration
and Mediation for a Meta-Gazetteer Service”. In: Geographic Information Science: 6th
International Conference, GIScience 2010, Zurich, Switzerland, September 14-17, 2010.
Proceedings. Ed. by S. I. Fabrikant, T. Reichenbacher, M. van Kreveld, and C. Schlieder.
Berlin, Heidelberg: Springer Berlin Heidelberg. 234–248. doi: 10.1007/978-3-642-15300-
6_17. url: https://doi.org/10.1007/978-3-642-15300-6_17.

Smith, D. A. and G. Crane. 2001. “Disambiguating geographic names in a historical digital
library”. In: Research and Advanced Technology for Digital Libraries. Springer. 127–136.

Smith, D. A. and G. S. Mann. 2003. “Bootstrapping Toponym Classifiers”. In: Proceedings
of the HLT-NAACL 2003 Workshop on Analysis of Geographic References - Volume 1.
HLT-NAACL-GEOREF ’03. Stroudsburg, PA, USA: Association for Computational
Linguistics. 45–49. doi: 10.3115/1119394.1119401. url: http://dx.doi.org/10.3115/
1119394.1119401.

Speriosu, M. and J. Baldridge. 2013. “Text-Driven Toponym Resolution using Indirect
Supervision”. In: Proceedings of the 51st Annual Meeting of the Association for Com-
putational Linguistics, ACL 2013, 4-9 August 2013, Sofia, Bulgaria, Volume 1: Long
Papers. 1466–1476. url: http://aclweb.org/anthology/P/P13/P13-1144.pdf.

Spink, A., B. J. Jansen, D. Wolfram, and T. Saracevic. 2002. “From e-sex to e-commerce:
Web search changes”. Computer. 35(3): 107–109. doi: 10.1109/2.989940.

Spink, A., D. Wolfram, M. B. J. Jansen, and T. Saracevic. 2001. “Searching the web:
The public and their queries”. Journal of the Association for Information Science and
Technology. 52(3): 226–234.

Stokes, N., Y. Li, A. Moffat, and J. Rong. 2008. “An empirical study of the effects of NLP
components on Geographic IR performance”. Int. J. Geogr. Inf. Sci. 22(3): 247–264.

Su, L. T. 2003. “A comprehensive and systematic model of user evaluation of Web search
engines: I. Theory and background”. J. Am. Soc. Inf. Sci. Technol. 54(13): 1175–1192.
doi: 10.1002/asi.10303. url: http://dx.doi.org/10.1002/asi.10303.

Sutcliffe, A. and M. Ennis. 1998. “Towards a cognitive theory of information retrieval”.
Interacting with Computers. 10(3): 321–351. HCI and Information Retrieval. doi:
10.1016/S0953-5438(98)00013-7. url: http://www.sciencedirect.com/
science/article/pii/S0953543898000137.

Talmy, L. 1983. “How Language Structures Space”. In: Spatial Orientation. New York:
Plenum. 225–282.

Tang, J. and M. Sanderson. 2010. “Evaluation and User Preference Study on Spatial
Diversity”. In: Advances in Information Retrieval: 32nd European Conference on IR
Research, ECIR 2010, Milton Keynes, UK, March 28-31, 2010. Proceedings. Ed. by C.
Gurrin, Y. He, G. Kazai, U. Kruschwitz, S. Little, T. Roelleke, S. Rüger, and K. Van
Rijsbergen. Berlin, Heidelberg: Springer Berlin Heidelberg. 179–190. doi: 10.1007/978-
3-642-12275-0_18. url: https://doi.org/10.1007/978-3-642-12275-0_18.

Teitler, B. E., M. D. Lieberman, D. Panozzo, J. Sankaranarayanan, H. Samet, and J.
Sperling. 2008. “NewsStand: A New View on News”. In: Proceedings of the 16th ACM
SIGSPATIAL International Conference on Advances in Geographic Information Systems.
GIS ’08. Irvine, California: ACM. 18:1–18:10. doi: 10.1145/1463434.1463458. url: http:
//doi.acm.org/10.1145/1463434.1463458.

Thomas, P. and D. Hawking. 2006. “Evaluation by Comparing Result Sets in Context”. In:
Proceedings of the 15th ACM International Conference on Information and Knowledge
Management. CIKM ’06. Arlington, Virginia, USA: ACM. 94–101. doi:
10.1145/1183614.1183632. url: http://doi.acm.org/10.1145/1183614.1183632.

Tobler, W. R. 1970. “A Computer Movie Simulating Urban Growth in the Detroit Region”.
Economic Geography. 46: 234–240. url: http://www.jstor.org/stable/143141.

Uryupina, O. 2003. “Semi-supervised Learning of Geographical Gazetteers from the Inter-
net”. In: Proceedings of the HLT-NAACL 2003 Workshop on Analysis of Geographic
References - Volume 1. HLT-NAACL-GEOREF ’03. Stroudsburg, PA, USA: Asso-
ciation for Computational Linguistics. 18–25. doi: 10.3115/1119394.1119397. url:
http://dx.doi.org/10.3115/1119394.1119397.

Vaid, S., C. B. Jones, H. Joho, and M. Sanderson. 2005. “Spatio-textual Indexing for
Geographical Search on the Web”. In: Advances in Spatial and Temporal Databases:
9th International Symposium, SSTD 2005, Angra dos Reis, Brazil, August 22-24, 2005.
Proceedings. Ed. by C. Bauzer Medeiros, M. J. Egenhofer, and E. Bertino. Berlin,
Heidelberg: Springer Berlin Heidelberg. 218–235. doi: 10.1007/11535331_13. url:
https://doi.org/10.1007/11535331_13.

Vakkari, P. 2012. “Evaluating Interactive Information Retrieval Systems”. Revista PRISMA.COM.
2012(19): 1–15. url: http://revistas.ua.pt/index.php/prismacom/article/view/2410.

Vakkari, P. and S. Huuskonen. 2012. “Search effort degrades search output but improves
task outcome”. Journal of the American Society for Information Science and Technology.
63(4): 657–670. doi: 10.1002/asi.21683. url: http://dx.doi.org/10.1002/asi.21683.

Van Rijsbergen, C. J. 1979. Information Retrieval. 2nd. Newton, MA, USA: Butterworth-
Heinemann.

Vaughan, M. W. and M. L. Resnick. 2006. “Search User Interfaces: Best Practices and
Future Visions”. J. Am. Soc. Inf. Sci. Technol. 57(6): 777–780. doi: 10.1002/asi.v57:6.
url: http://dx.doi.org/10.1002/asi.v57:6.

Voorhees, E. M. and D. K. Harman. 2005. TREC: Experiment and Evaluation in Information
Retrieval (Digital Libraries and Electronic Publishing). The MIT Press.

Wallgrün, J. O., M. Karimzadeh, A. M. MacEachren, and S. Pezanowski. 2017. “GeoCorpora:
building a corpus to test and train microblog geoparsers”. International Journal of
Geographical Information Science. 0(0): 1–29. doi: 10.1080/13658816.2017.1368523.
eprint: http://dx.doi.org/10.1080/13658816.2017.1368523. url: http://dx.doi.org/10.
1080/13658816.2017.1368523.

Wang, C., X. Xie, L. Wang, Y. Lu, and W.-Y. Ma. 2005. “Web Resource Geographic
Location Classification and Detection”. In: Special Interest Tracks and Posters of the 14th
International Conference on World Wide Web. WWW ’05. Chiba, Japan: ACM. 1138–1
doi: 10.1145/1062745.1062907. url: http://doi.acm.org/10.1145/1062745.1062907.

Wang, W. and K. Stewart. 2015. “Creating spatiotemporal semantic maps from web text
documents”. In: Space-Time Integration in Geography and GIScience. Springer. 157–174.

White, R. W. 2016. Interactions with Search Systems. Cambridge University Press. doi:
10.1017/CBO9781139525305. url: http://dx.doi.org/10.1017/CBO9781139525305.

Wilkening, J. and S. I. Fabrikant. 2013. “How Users Interact With a 3D Geo-Browser under
Time Pressure”. Cartography and Geographic Information Science. 40: 40–52.

Wilson, M. L. 2011. Search User Interface Design. Morgan & Claypool Publishers.

Wing, B. P. and J. Baldridge. 2011. “Simple supervised document geolocation with geodesic
grids”. In: Proceedings of the 49th Annual Meeting of the Association for Computational
Linguistics: Human Language Technologies-Volume 1. Association for Computational
Linguistics. 955–964.

Woodruff, A. G. and C. Plaunt. 1994. “GIPSY: Automated Geographic Indexing of Text
Documents”. J. Am. Soc. Inf. Sci. 45(9): 645–655. doi: 10.1002/(SICI)1097-4571(199410)
45:9<645::AID-ASI2>3.0.CO;2-8.

Wu, D., G. Cong, and C. S. Jensen. 2012. “A Framework for Efficient Spatial Web Object
Retrieval”. The VLDB Journal. 21(6): 797–822. doi: 10.1007/s00778-012-0271-0. url:
http://dx.doi.org/10.1007/s00778-012-0271-0.

Xiao, X., Q. Luo, Z. Li, X. Xie, and W.-Y. Ma. 2010. “A Large-scale Study on Map
Search Logs”. ACM Trans. Web. 4(3): 8:1–8:33. doi: 10.1145/1806916.1806917. url:
http://doi.acm.org/10.1145/1806916.1806917.

Yan, H., S. Ding, and T. Suel. 2009. “Inverted Index Compression and Query Processing with
Optimized Document Ordering”. In: Proceedings of the 18th International Conference
on World Wide Web. WWW ’09. Madrid, Spain: ACM. 401–410. doi:
10.1145/1526709.1526764. url: http://doi.acm.org/10.1145/1526709.1526764.

Zaila, Y. L. and D. Montesi. 2015. “Geographic Information Extraction, Disambiguation and
Ranking Techniques”. In: Proceedings of the 9th Workshop on Geographic Information
Retrieval. GIR ’15. Paris, France: ACM. 11:1–11:7. doi: 10.1145/2837689.2837695. url:
http://doi.acm.org/10.1145/2837689.2837695.

Zandbergen, P. A. 2008. “A comparison of address point, parcel and street geocoding
techniques”. Computers, Environment and Urban Systems. 32(3): 214–232.

Zhai, C. X., W. W. Cohen, and J. Lafferty. 2003. “Beyond Independent Relevance: Methods
and Evaluation Metrics for Subtopic Retrieval”. In: Proceedings of the 26th Annual
International ACM SIGIR Conference on Research and Development in Information
Retrieval. SIGIR ’03. Toronto, Canada: ACM. 10–17. doi: 10.1145/860435.860440. url:
http://doi.acm.org/10.1145/860435.860440.

Zhang, C., Y. Zhang, W. Zhang, and X. Lin. 2013. “Inverted linear quadtree: Efficient
top k spatial keyword search”. In: 2013 IEEE 29th International Conference on Data
Engineering (ICDE). 901–912. doi: 10.1109/ICDE.2013.6544884.

Zhou, Y., X. Xie, C. Wang, Y. Gong, and W.-Y. Ma. 2005. “Hybrid Index Structures for
Location-based Web Search”. In: Proceedings of the 14th ACM International Conference
on Information and Knowledge Management. CIKM ’05. Bremen, Germany: ACM. 155–1
doi: 10.1145/1099554.1099584. url: http://doi.acm.org/10.1145/1099554.1099584.