geograpy package


geograpy.extraction module

class geograpy.extraction.Extractor(text=None, url=None, debug=False)[source]

Bases: object

Extract geo context for text or from url

find_entities(labels=['GPE', 'GSP', 'PERSON', 'ORGANIZATION'])[source]

Find entities with the given labels, set self.places and return it.

Parameters:labels (Labels) – the labels to filter
Returns:List of places
Return type:list

Find geographic entities

Returns:List of places
Return type:list

Setter for text

split(delimiter=', ')[source]

simpler regular expression splitter with no entity check
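
Splitting on a delimiter with a regular expression can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `split_text` is hypothetical, and the input is assumed to be a plain string.

```python
import re


def split_text(text, delimiter=", "):
    """Split text on a literal delimiter and drop empty pieces (hypothetical helper)."""
    # re.escape makes the delimiter match literally, even if it contains
    # characters that have a special meaning in regular expressions
    parts = re.split(re.escape(delimiter), text)
    return [part.strip() for part in parts if part.strip()]


print(split_text("San Francisco, CA"))
```

Escaping the delimiter keeps a value such as `.` or `|` from being interpreted as a pattern.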

hat tip:

geograpy.labels module

Created on 2020-09-10

@author: wf

class geograpy.labels.Labels[source]

Bases: object

NLTK labels

default = ['GPE', 'GSP', 'PERSON', 'ORGANIZATION']
geo = ['GPE', 'GSP']
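
The two label sets can be used to filter tagged entities. The sketch below assumes entities arrive as (name, label) tuples, which is an assumption made for illustration; the function name `filter_by_labels` and the sample data are hypothetical.

```python
# Label sets copied from the Labels class above
DEFAULT_LABELS = ['GPE', 'GSP', 'PERSON', 'ORGANIZATION']
GEO_LABELS = ['GPE', 'GSP']


def filter_by_labels(entities, labels):
    """Keep the names of (name, label) pairs whose label is in labels.

    The (name, label) tuple layout is an assumption made for this sketch.
    """
    wanted = set(labels)
    return [name for name, label in entities if label in wanted]


# Hypothetical tagger output, not real NLTK results
entities = [("Vienna", "GPE"), ("Mozart", "PERSON"), ("UNESCO", "ORGANIZATION")]
print(filter_by_labels(entities, GEO_LABELS))      # geographic entities only
print(filter_by_labels(entities, DEFAULT_LABELS))  # all default labels
```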

geograpy.locator module

The locator module makes it possible to get detailed city information, including the region and country of a city, from a location string.

Examples for location strings are:

    Amsterdam, Netherlands
    Vienna, Austria
    Vienna, IL
    Paris - Texas
    Paris TX

The locator will look up the cities and try to disambiguate the result based on the country or region information found.

The results in string representation are:

    Amsterdam (NH(North Holland) - NL(Netherlands))
    Vienna (9(Vienna) - AT(Austria))
    Vienna (IL(Illinois) - US(United States))
    Paris (TX(Texas) - US(United States))
    Paris (TX(Texas) - US(United States))

Each city returned has a city.region and a city.country attribute with the details of the city.
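
The string representation shown above can be reproduced with a few small classes. This is a minimal sketch, not the library's implementation: the class names `RegionInfo`, `CountryInfo` and `CityInfo` and their `iso` and `name` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RegionInfo:
    """Hypothetical stand-in for a region."""
    iso: str
    name: str


@dataclass
class CountryInfo:
    """Hypothetical stand-in for a country."""
    iso: str
    name: str


@dataclass
class CityInfo:
    """Hypothetical stand-in for a city with region and country details."""
    name: str
    region: RegionInfo
    country: CountryInfo

    def __str__(self):
        # Format: City (REGION_ISO(Region name) - COUNTRY_ISO(Country name))
        return (f"{self.name} ({self.region.iso}({self.region.name})"
                f" - {self.country.iso}({self.country.name}))")


city = CityInfo("Amsterdam",
                RegionInfo("NH", "North Holland"),
                CountryInfo("NL", "Netherlands"))
print(city)
```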

Created on 2020-09-18

@author: wf

class geograpy.locator.City[source]

Bases: object

a single city as an object

static fromGeoLite2(record)[source]
setValue(name, record)[source]

set the field with the given name to the corresponding entry of the given record dict, or to None if there is no such entry

  • name (string) – the name of the field
  • record (dict) – the dict to get the value from
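
A minimal sketch of this behaviour, assuming a missing key should yield None; the `Record` class is hypothetical and only illustrates the idea.

```python
class Record:
    """Hypothetical object that copies fields from a dict."""

    def setValue(self, name, record):
        # dict.get returns None when the key is missing
        setattr(self, name, record.get(name))


city = Record()
city.setValue("name", {"name": "Vienna"})
city.setValue("population", {"name": "Vienna"})  # key is absent in the dict
print(city.name, city.population)
```
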
class geograpy.locator.Country[source]

Bases: object

a country

static fromGeoLite2(record)[source]

create a country from a geolite2 record

static fromPyCountry(pcountry)[source]
Parameters:pcountry (PyCountry) – a country as gotten from pycountry
Returns:the country
Return type:Country
class geograpy.locator.Locator(db_file=None, correctMisspelling=False, debug=False)[source]

Bases: object

location handling


find cities with the given cityName

Parameters:cityName (string) – the potential name of a city
Returns:a list of city records

correct potential misspellings

Parameters:name (string) – the name of the country potentially misspelled
Returns:the corrected name, or the name unchanged
Return type:string

check whether the database has data / is populated

Returns:True if the cities table exists and has more than one record
Return type:boolean
db_recordCount(tableList, tableName)[source]

count the number of records for the given tableName

  • tableList (list) – the list of tables to check
  • tableName (str) – the name of the table to check
Returns:the number of records found for the table
Return type:int
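
Counting the records of a table only when it is known to exist can be sketched with the standard `sqlite3` module. Everything here is an assumption for illustration: SQLite as the backend, a table list made of dicts with a `name` key, and the helper name `record_count`.

```python
import sqlite3


def record_count(connection, table_list, table_name):
    """Return the number of rows in table_name, or 0 if it is not in table_list."""
    # table_list is assumed to be a list of dicts with a "name" key
    known = {table["name"] for table in table_list}
    if table_name not in known:
        return 0
    # table_name was checked against the known tables, so interpolating it is safe
    row = connection.execute(f'SELECT COUNT(*) FROM "{table_name}"').fetchone()
    return row[0]


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE cities (name TEXT)")
connection.executemany("INSERT INTO cities VALUES (?)", [("Vienna",), ("Paris",)])
table_list = [
    {"name": row[0]}
    for row in connection.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
print(record_count(connection, table_list, "cities"))
print(record_count(connection, table_list, "regions"))
```
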
disambiguate(country, regions, cities, byPopulation=True)[source]

try determining country, regions and city from the potential choices

  • country (Country) – a matching country found
  • regions (list) – a list of matching Regions found
  • cities (list) – a list of matching cities found

Returns:the found city or None
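
One way to narrow down candidates by country and region and then prefer the most populous city is sketched below. This is not the library's algorithm: candidates are plain dicts with hypothetical keys, the function name `pick_city` is invented, and the population figures are illustrative numbers rather than real data.

```python
def pick_city(country, regions, cities, by_population=True):
    """Pick one city from the candidates, or None if there are none.

    Candidates are plain dicts with the hypothetical keys
    "name", "country", "region" and "population".
    """
    candidates = list(cities)
    if country is not None:
        in_country = [c for c in candidates if c["country"] == country]
        # only narrow down if at least one candidate survives the filter
        candidates = in_country or candidates
    if regions:
        in_region = [c for c in candidates if c["region"] in regions]
        candidates = in_region or candidates
    if not candidates:
        return None
    if by_population:
        # treat a missing population as 0 so that max() never compares None
        return max(candidates, key=lambda c: c["population"] or 0)
    return candidates[0]


# Illustrative values only, not real population data
cities = [
    {"name": "Paris", "country": "FR", "region": "IDF", "population": 2000000},
    {"name": "Paris", "country": "US", "region": "TX", "population": 25000},
]
print(pick_city(None, ["TX"], cities))
print(pick_city(None, [], cities))
```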



get the aliases hashTable


get the country for the given name

Parameters:name (string) – the name of the country to look up
Returns:the country if one was found or None if not
Return type:Country

get the Geolite2 City-Locations as a list of Dicts

Returns:a list of Geolite2 City-Locator dicts
Return type:list
static getInstance(correctMisspelling=False, debug=False)[source]

get the singleton instance of the Locator. If parameters are changed on further calls the initial parameters will still be in effect since the original instance will be returned!

  • correctMisspelling (bool) – if True correct typical misspellings
  • debug (bool) – if True show debug information
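
The effect described above, where later calls return the original instance and ignore new arguments, can be illustrated with a small singleton. The `Finder` class is hypothetical and only mirrors the getInstance and resetInstance names from this page.

```python
class Finder:
    """Hypothetical class with a getInstance-style singleton accessor."""

    _instance = None

    def __init__(self, correctMisspelling=False, debug=False):
        self.correctMisspelling = correctMisspelling
        self.debug = debug

    @staticmethod
    def getInstance(correctMisspelling=False, debug=False):
        # create the instance only on the first call; later arguments are ignored
        if Finder._instance is None:
            Finder._instance = Finder(correctMisspelling=correctMisspelling, debug=debug)
        return Finder._instance

    @staticmethod
    def resetInstance():
        # forget the instance so that the next getInstance call creates a new one
        Finder._instance = None


first = Finder.getInstance(debug=True)
second = Finder.getInstance(debug=False)
print(first is second, second.debug)
```

In this sketch, changing the parameters requires calling resetInstance() before the next getInstance().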

get the view to be used

Returns:the SQL view to be used for CityLookups e.g. GeoLite2CityLookup
Return type:str
getWikidataCityPopulation(sqlDB, endpoint=None)[source]
  • sqlDB (SQLDB) – target SQL database
  • endpoint (str) – url of the wikidata endpoint or None if default should be used

check if the given string is an ISO code

Returns:True if the string is an ISO Code
Return type:bool

check if the given string name is a country

Parameters:name (string) – the string to check
Returns:True if pycountry thinks the string is a country
Return type:bool

locate a city, region, country combination based on the given word token information

  • places (list) – a list of places derived by splitting a locality, e.g. “San Francisco, CA” leads to “San Francisco”, “CA”

Returns:a city with country and region details


locator = None
places_by_name(placeName, columnName)[source]

get places by name and column

  • placeName (string) – the name of the place
  • columnName (string) – the column to look at


populate countries and regions from Wikidata

Parameters:sqlDB (SQLDB) – target SQL database

populate the given sqlDB with the Geolite2 Cities

Parameters:sqlDB (SQLDB) – the SQL database to use

populate the given sqlDB with the Wikidata Cities

Parameters:sqlDB (SQLDB) – target SQL database

populate database with countries from wikiData

Parameters:sqlDB (SQLDB) – target SQL database

populate database with regions from wikiData

Parameters:sqlDB (SQLDB) – target SQL database

populate the version table

Parameters:sqlDB (SQLDB) – target SQL database

populate the cities SQL database which caches the information from the GeoLite2-City-Locations.csv file

Parameters:force (bool) – if True force a recreation of the database

recreate my lookup database


get the regions for the given region_name (which might be an ISO code)

Parameters:region_name (string) – region name
Returns:the list of regions matching the given name
Return type:list
static resetInstance()[source]
class geograpy.locator.Region[source]

Bases: object

a Region (Subdivision)

static fromGeoLite2(record)[source]

create a region from a Geolite2 record

Parameters:record (dict) – the record as returned from a query
Returns:the corresponding region information
Return type:Region
static fromWikidata(record)[source]

create a region from a Wikidata record

Parameters:record (dict) – the record as returned from a query
Returns:the corresponding region information
Return type:Region

main program.

geograpy.places module

class geograpy.places.PlaceContext(place_names, setAll=True)[source]

Bases: geograpy.locator.Locator

Adds context information to a place name


Set all context information


set the cities information


get the country information from my places


geograpy.prefixtree module

geograpy.utils module

geograpy.utils.fuzzy_match(s1, s2, max_dist=0.8)[source]

Fuzzy match the given two strings with the given maximum distance

  • s1 (string) – First string
  • s2 (string) – Second string
  • max_dist (float) – The distance - default: 0.8

Returns:jellyfish jaro_winkler_similarity based on
Return type:float
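
Threshold-based fuzzy matching can be sketched with the standard library. Note that this swaps in `difflib.SequenceMatcher` for the jellyfish Jaro-Winkler similarity named above, so the scores will differ; comparing the score against a threshold is also an assumption, and the names `similarity` and `is_fuzzy_match` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(s1, s2):
    """Return a similarity score between 0.0 and 1.0 (difflib ratio, not Jaro-Winkler)."""
    return SequenceMatcher(None, s1, s2).ratio()


def is_fuzzy_match(s1, s2, threshold=0.8):
    """Treat the strings as matching when their similarity exceeds the threshold."""
    return similarity(s1, s2) > threshold


print(is_fuzzy_match("Vienna", "Viena"))
print(is_fuzzy_match("Vienna", "Paris"))
```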

Remove non ascii chars from the given string

Parameters:s (string) – The string to remove chars from
Returns:The result string with non-ascii chars removed
Return type:string
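
Removing non-ASCII characters can be done by keeping only code points below 128. This is a minimal sketch with a hypothetical function name, not necessarily how the library does it.

```python
def strip_non_ascii(s):
    """Return s with every character outside the ASCII range removed."""
    return "".join(ch for ch in s if ord(ch) < 128)


print(strip_non_ascii("Zürich"))
```

Characters are dropped rather than transliterated, so "ü" disappears instead of becoming "u".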

Hat tip:

geograpy.wikidata module

Created on 2020-09-23

@author: wf

class geograpy.wikidata.Wikidata(endpoint='')[source]

Bases: object

Wikidata access

getCities(region=None, country=None)[source]

get the cities from Wikidata


get the city populations from Wikidata

Parameters:profile (bool) – if True show profiling information

get a list of countries

try query


get Regions from Wikidata

try query

Module contents

main geograpy 3 module

geograpy.get_geoPlace_context(url=None, text=None, debug=False)[source]

Get a place context for a given text with information about country, region, city and other based on NLTK Named Entities having the Geographic(GPE) label.

  • url (String) – the url to read text from (if any)
  • text (String) – the text to analyze
  • debug (boolean) – if True show debug information

Returns:the place context
Return type:PlaceContext


geograpy.get_place_context(url=None, text=None, labels=['GPE', 'GSP', 'PERSON', 'ORGANIZATION'], debug=False)[source]

Get a place context for a given text with information about country, region, city and other based on NLTK Named Entities in the label set Geographic(GPE), Person(PERSON) and Organization(ORGANIZATION).

  • url (String) – the url to read text from (if any)
  • text (String) – the text to analyze
  • debug (boolean) – if True show debug information

Returns:the place context
Return type:PlaceContext


geograpy.locateCity(location, correctMisspelling=False, debug=False)[source]

locate the given location string

Parameters:location (string) – the description of the location
Returns:the location
Return type:Locator