Welcome to geograpy3’s documentation!

geograpy package

Submodules

geograpy.extraction module

class geograpy.extraction.Extractor(text=None, url=None, debug=False)[source]

Bases: object

Extract geo context for text or from url

find_entities(labels=['GPE', 'GSP', 'PERSON', 'ORGANIZATION'])[source]

Find entities with the given labels, set self.places and return it.

Parameters:labels (Labels) – the labels to filter
Returns:List of places
Return type:list
find_geoEntities()[source]

Find geographic entities

Returns:List of places
Return type:list
set_text()[source]

Setter for text

split(delimiter=', ')[source]

simpler regular expression splitter with no entity check

hat tip: https://stackoverflow.com/a/1059601/1497139
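
A minimal sketch of a delimiter-based regular expression splitter of this kind. The function name split_text, the default delimiter and the whitespace stripping are assumptions made for illustration; this is not the library's implementation.

```python
import re

def split_text(text, delimiter=","):
    """Split text on the delimiter and strip surrounding whitespace (illustrative only)."""
    # re.escape keeps delimiters such as "." or "|" from being read as regex syntax
    parts = re.split(re.escape(delimiter), text)
    # drop empty fragments caused by leading, trailing or repeated delimiters
    return [part.strip() for part in parts if part.strip()]

places = split_text("Vienna, Austria")
print(places)
```

Escaping the delimiter is a deliberate choice here: it lets callers pass any literal separator without knowing regular expression syntax.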

geograpy.labels module

Created on 2020-09-10

@author: wf

class geograpy.labels.Labels[source]

Bases: object

NLTK labels

default = ['GPE', 'GSP', 'PERSON', 'ORGANIZATION']
geo = ['GPE', 'GSP']
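
The two label lists can be used to narrow named-entity results. The sketch below assumes entities arrive as (text, label) pairs; the filter_entities helper and the sample entities are hypothetical and only illustrate the filtering idea.

```python
# label lists copied from the Labels class above
DEFAULT_LABELS = ['GPE', 'GSP', 'PERSON', 'ORGANIZATION']
GEO_LABELS = ['GPE', 'GSP']

def filter_entities(entities, labels):
    """Keep the text of every (text, label) pair whose label is in labels."""
    return [text for text, label in entities if label in labels]

# hypothetical named-entity output; real input would come from an NLTK chunker
entities = [("Berlin", "GPE"), ("Angela Merkel", "PERSON"), ("Siemens", "ORGANIZATION")]
geo_places = filter_entities(entities, GEO_LABELS)
all_places = filter_entities(entities, DEFAULT_LABELS)
print(geo_places, all_places)
```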

geograpy.locator module

The locator module provides detailed city information, including the region and country of a city, from a location string.

Examples for location strings are:

Amsterdam, Netherlands
Vienna, Austria
Vienna, IL
Paris - Texas
Paris TX

The locator will look up the cities and try to disambiguate the result based on the country or region information found.

The results in string representation are:

Amsterdam (NH(North Holland) - NL(Netherlands))
Vienna (9(Vienna) - AT(Austria))
Vienna (IL(Illinois) - US(United States))
Paris (TX(Texas) - US(United States))
Paris (TX(Texas) - US(United States))

Each city returned has a city.region and city.country attribute with the details of the city.

Created on 2020-09-18

@author: wf

class geograpy.locator.City[source]

Bases: object

a single city as an object

static fromGeoLite2(record)[source]
setValue(name, record)[source]

set a field value with the given name to the given record dict's corresponding entry, or None if there is no such entry

Parameters:
  • name (string) – the name of the field
  • record (dict) – the dict to get the value from
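
The behaviour described for setValue can be sketched with setattr and dict.get. The class CitySketch below is a hypothetical stand-in, not the library's City class, and the record keys are made up for the example.

```python
class CitySketch:
    """Hypothetical stand-in for a record-backed object."""

    def setValue(self, name, record):
        # dict.get returns None when the key is missing, so the attribute always exists
        setattr(self, name, record.get(name))

city = CitySketch()
record = {"name": "Vienna"}
city.setValue("name", record)
city.setValue("wikidataid", record)  # this key is absent from the record
print(city.name, city.wikidataid)
```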
class geograpy.locator.Country[source]

Bases: object

a country

static fromGeoLite2(record)[source]

create a country from a geolite2 record

static fromPyCountry(pcountry)[source]
Parameters:pcountry (PyCountry) – a country as gotten from pycountry
Returns:the country
Return type:Country
class geograpy.locator.Locator(db_file=None, correctMisspelling=False, debug=False)[source]

Bases: object

location handling

cities_for_name(cityName)[source]

find cities with the given cityName

Parameters:cityName (string) – the potential name of a city
Returns:a list of city records
correct_country_misspelling(name)[source]

correct potential misspellings

Parameters:name (string) – the name of the country potentially misspelled
Returns:the corrected name, or the name unchanged
Return type:string
createViews(sqlDB)[source]
db_has_data()[source]

check whether the database has data / is populated

Returns:True if the cities table exists and has more than one record
Return type:boolean
db_recordCount(tableList, tableName)[source]

count the number of records for the given tableName

Parameters:
  • tableList (list) – the list of tables to check
  • tableName (str) – the name of the table to check
Returns:the number of records found for the table
Return type:int
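
A rough sketch of how a "table exists and has records" check might look with the standard sqlite3 module. The helper record_count and its signature are assumptions for illustration and differ from db_recordCount(tableList, tableName) documented above.

```python
import sqlite3

def record_count(connection, table_name):
    """Return the number of rows in table_name, or 0 if the table does not exist."""
    cursor = connection.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name=?", (table_name,)
    )
    if cursor.fetchone() is None:
        return 0
    # the name matched an existing table, so it is interpolated as a quoted identifier
    count_cursor = connection.execute(f'SELECT COUNT(*) FROM "{table_name}"')
    return count_cursor.fetchone()[0]

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE cities (name TEXT)")
connection.executemany("INSERT INTO cities VALUES (?)", [("Vienna",), ("Paris",)])
# mirrors the "more than one record" rule described for db_has_data
has_data = record_count(connection, "cities") > 1
print(has_data)
```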
disambiguate(country, regions, cities, byPopulation=True)[source]

try determining country, regions and city from the potential choices

Parameters:
  • country (Country) – a matching country found
  • regions (list) – a list of matching Regions found
  • cities (list) – a list of matching cities found
Returns:

the found city or None

Return type:

City
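
One way such a disambiguation step could work is to narrow the candidates by country, then by region, and finally prefer the most populous match. The sketch below is only an illustration under these assumptions: the Candidate fields, the pick_city helper and the population numbers are hypothetical and do not reflect the library's actual logic or data.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Candidate:
    """Hypothetical city candidate; field names are assumptions for this sketch."""
    name: str
    country_iso: str
    region_iso: str
    population: int

def pick_city(country_iso: Optional[str], region_isos: List[str],
              cities: List[Candidate], by_population: bool = True) -> Optional[Candidate]:
    candidates = cities
    if country_iso is not None:
        # keep only cities in the matching country
        candidates = [c for c in candidates if c.country_iso == country_iso]
    if region_isos:
        # keep only cities in one of the matching regions
        candidates = [c for c in candidates if c.region_iso in region_isos]
    if not candidates:
        return None
    if by_population:
        return max(candidates, key=lambda c: c.population)
    return candidates[0]

cities = [
    Candidate("Vienna", "AT", "9", 1000),  # placeholder population, not a real statistic
    Candidate("Vienna", "US", "IL", 10),   # placeholder population, not a real statistic
]
by_region = pick_city(None, ["IL"], cities)
by_size = pick_city(None, [], cities)
print(by_region, by_size)
```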

getAliases()[source]

get the aliases hashTable

getCountry(name)[source]

get the country for the given name

Parameters:name (string) – the name of the country to look up
Returns:the country if one was found or None if not
Return type:Country
getGeolite2Cities()[source]

get the Geolite2 City-Locations as a list of Dicts

Returns:a list of Geolite2 City-Locator dicts
Return type:list
static getInstance(correctMisspelling=False, debug=False)[source]

get the singleton instance of the Locator. If parameters are changed on further calls the initial parameters will still be in effect since the original instance will be returned!

Parameters:
  • correctMisspelling (bool) – if True correct typical misspellings
  • debug (bool) – if True show debug information
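
The note about initial parameters staying in effect follows from the singleton pattern. The sketch below uses a hypothetical LocatorSketch class to show the effect; it is not the library's Locator, and the reset behaviour shown is an assumption.

```python
class LocatorSketch:
    """Hypothetical singleton holder used only to illustrate the pattern."""
    _instance = None

    def __init__(self, correctMisspelling=False, debug=False):
        self.correctMisspelling = correctMisspelling
        self.debug = debug

    @staticmethod
    def getInstance(correctMisspelling=False, debug=False):
        # only the first call constructs the object; later arguments are ignored
        if LocatorSketch._instance is None:
            LocatorSketch._instance = LocatorSketch(correctMisspelling, debug)
        return LocatorSketch._instance

    @staticmethod
    def resetInstance():
        # forget the cached instance so the next getInstance call builds a new one
        LocatorSketch._instance = None

first = LocatorSketch.getInstance(correctMisspelling=True)
second = LocatorSketch.getInstance(correctMisspelling=False)
print(first is second, second.correctMisspelling)
```

In this sketch the second call returns the first object, so correctMisspelling stays True until the instance is reset.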
getView()[source]

get the view to be used

Returns:the SQL view to be used for CityLookups e.g. GeoLite2CityLookup
Return type:str
getWikidataCityPopulation(sqlDB, endpoint=None)[source]
Parameters:
  • sqlDB (SQLDB) – target SQL database
  • endpoint (str) – url of the wikidata endpoint or None if default should be used
isISO(s)[source]

check if the given string is an ISO code

Returns:True if the string is an ISO Code
Return type:bool
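
A check of this kind can be sketched with a regular expression. The pattern below is an assumption based on the shape of ISO 3166-1 alpha-2 codes ("US") and ISO 3166-2 subdivision codes ("US-TX"); the expression the library actually uses may differ, and the helper name looks_like_iso is hypothetical.

```python
import re

# two uppercase letters, optionally followed by "-" and one to three letters or digits
ISO_PATTERN = re.compile(r"[A-Z]{2}(-[A-Z0-9]{1,3})?")

def looks_like_iso(s):
    """Return True if the whole string has the shape of an ISO 3166 code."""
    return ISO_PATTERN.fullmatch(s) is not None

for candidate in ["US", "US-TX", "AT-9", "Texas"]:
    print(candidate, looks_like_iso(candidate))
```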
is_a_country(name)[source]

check if the given string name is a country

Parameters:name (string) – the string to check
Returns:True if pycountry thinks the string is a country
Return type:bool
locateCity(places)[source]

locate a city, region, country combination based on the given word token information

Parameters:
  • places (list) – a list of places derived by splitting a locality, e.g. “San Francisco, CA” leads to “San Francisco”, “CA”
Returns:

a city with country and region details

Return type:

City

locator = None
places_by_name(placeName, columnName)[source]

get places by name and column

Parameters:
  • placeName (string) – the name of the place
  • columnName (string) – the column to look at

populateFromWikidata(sqlDB)[source]

populate countries and regions from Wikidata

Parameters:sqlDB (SQLDB) – target SQL database
populate_Cities(sqlDB)[source]

populate the given sqlDB with the Geolite2 Cities

Parameters:sqlDB (SQLDB) – the SQL database to use
populate_Cities_FromWikidata(sqlDB)[source]

populate the given sqlDB with the Wikidata Cities

Parameters:sqlDB (SQLDB) – target SQL database
populate_Countries(sqlDB)[source]

populate database with countries from Wikidata

Parameters:sqlDB (SQLDB) – target SQL database
populate_Regions(sqlDB)[source]

populate database with regions from Wikidata

Parameters:sqlDB (SQLDB) – target SQL database
populate_Version(sqlDB)[source]

populate the version table

Parameters:sqlDB (SQLDB) – target SQL database
populate_db(force=False)[source]

populate the cities SQL database which caches the information from the GeoLite2-City-Locations.csv file

Parameters:force (bool) – if True force a recreation of the database
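
The caching idea, loading CSV rows into SQLite once and rebuilding only when forced, might be sketched as follows. The cache_rows helper, the two-column schema and the sample CSV text are assumptions for illustration; the real GeoLite2 file has different columns and the library's own routine may work differently.

```python
import csv
import io
import sqlite3

def cache_rows(connection, csv_text, force=False):
    """Copy CSV rows into a 'cities' table; skip the work if it is already filled."""
    if force:
        connection.execute("DROP TABLE IF EXISTS cities")
    connection.execute("CREATE TABLE IF NOT EXISTS cities (name TEXT, country TEXT)")
    (existing,) = connection.execute("SELECT COUNT(*) FROM cities").fetchone()
    if existing > 0:
        return existing  # cache hit: keep the rows that are already stored
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(row["name"], row["country"]) for row in reader]
    connection.executemany("INSERT INTO cities VALUES (?, ?)", rows)
    connection.commit()
    return len(rows)

# hypothetical two-column CSV used only for this example
csv_text = "name,country\nVienna,AT\nParis,FR\n"
connection = sqlite3.connect(":memory:")
first_load = cache_rows(connection, csv_text)
second_load = cache_rows(connection, "name,country\nRome,IT\n")
forced_load = cache_rows(connection, "name,country\nRome,IT\n", force=True)
print(first_load, second_load, forced_load)
```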
readCSV(fileName)[source]
recreateDatabase()[source]

recreate my lookup database

regions_for_name(region_name)[source]

get the regions for the given region_name (which might be an ISO code)

Parameters:region_name (string) – region name
Returns:the list of cities for this region
Return type:list
static resetInstance()[source]
class geograpy.locator.Region[source]

Bases: object

a Region (Subdivision)

static fromGeoLite2(record)[source]

create a region from a Geolite2 record

Parameters:record (dict) – the records as returned from a Query
Returns:the corresponding region information
Return type:Region
static fromWikidata(record)[source]

create a region from a Wikidata record

Parameters:record (dict) – the records as returned from a Query
Returns:the corresponding region information
Return type:Region
geograpy.locator.main(argv=None)[source]

main program.

geograpy.places module

class geograpy.places.PlaceContext(place_names, setAll=True)[source]

Bases: geograpy.locator.Locator

Adds context information to a place name

get_region_names(country_name)[source]
setAll()[source]

Set all context information

set_cities()[source]

set the cities information

set_countries()[source]

get the country information from my places

set_other()[source]
set_regions()[source]

geograpy.prefixtree module

geograpy.utils module

geograpy.utils.fuzzy_match(s1, s2, max_dist=0.8)[source]

Fuzzy match the given two strings with the given maximum distance

Parameters:
  • s1 (string) – first string
  • s2 (string) – second string
  • max_dist (float) – the distance, default: 0.8
Returns:jellyfish jaro_winkler_similarity based on https://en.wikipedia.org/wiki/Jaro-Winkler_distance
Return type:float
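
To illustrate threshold-based fuzzy matching without extra dependencies, the sketch below swaps in difflib.SequenceMatcher from the standard library in place of jellyfish's Jaro-Winkler similarity. The scores therefore differ from what fuzzy_match returns; the helper names and the lower-casing step are assumptions.

```python
from difflib import SequenceMatcher

def similarity(s1, s2):
    """Return a similarity score between 0.0 and 1.0 (difflib ratio, not Jaro-Winkler)."""
    return SequenceMatcher(None, s1.lower(), s2.lower()).ratio()

def is_close(s1, s2, threshold=0.8):
    """Treat two strings as a match when their similarity reaches the threshold."""
    return similarity(s1, s2) >= threshold

print(is_close("Netherlands", "Netherland"))
print(is_close("Vienna", "Paris"))
```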
geograpy.utils.remove_non_ascii(s)[source]

Remove non-ASCII chars from the given string

Parameters:s (string) – the string to remove chars from
Returns:The result string with non-ascii chars removed
Return type:string

Hat tip: http://stackoverflow.com/a/1342373/2367526
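
A minimal sketch of the idea, assuming "non-ASCII" means any character with a code point of 128 or higher. The helper name strip_non_ascii is hypothetical and the library's own implementation may differ.

```python
def strip_non_ascii(s):
    """Return s without any character outside the 7-bit ASCII range."""
    return "".join(ch for ch in s if ord(ch) < 128)

# "\u00fc" is the letter u with diaeresis, written as an escape to keep this file ASCII
cleaned = strip_non_ascii("Z\u00fcrich")
print(cleaned)
```

Note that characters are dropped rather than transliterated, so the result can be shorter than the input.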

geograpy.wikidata module

Created on 2020-09-23

@author: wf

class geograpy.wikidata.Wikidata(endpoint='https://query.wikidata.org/sparql')[source]

Bases: object

Wikidata access

getCities(region=None, country=None)[source]

get the cities from Wikidata

getCityPopulations(profile=True)[source]

get the city populations from Wikidata

Parameters:profile (bool) – if True show profiling information
getCountries()[source]

get a list of countries

try query

getRegions()[source]

get Regions from Wikidata

try query

Module contents

main geograpy 3 module

geograpy.get_geoPlace_context(url=None, text=None, debug=False)[source]

Get a place context for a given text with information about country, region, city and other based on NLTK Named Entities having the Geographic(GPE) label.

Parameters:
  • url (String) – the url to read text from (if any)
  • text (String) – the text to analyze
  • debug (boolean) – if True show debug information
Returns:the place context
Return type:PlaceContext

geograpy.get_place_context(url=None, text=None, labels=['GPE', 'GSP', 'PERSON', 'ORGANIZATION'], debug=False)[source]

Get a place context for a given text with information about country, region, city and other based on NLTK Named Entities in the label set Geographic(GPE), Person(PERSON) and Organization(ORGANIZATION).

Parameters:
  • url (String) – the url to read text from (if any)
  • text (String) – the text to analyze
  • labels (list) – the labels to filter
  • debug (boolean) – if True show debug information
Returns:the place context
Return type:PlaceContext

geograpy.locateCity(location, correctMisspelling=False, debug=False)[source]

locate the given location string

Parameters:location (string) – the description of the location
Returns:the location
Return type:Locator

setup module

tests package

Submodules

tests.test_extractor module

class tests.test_extractor.TestExtractor(methodName='runTest')[source]

Bases: unittest.case.TestCase

test Extractor

check(places, expectedList)[source]

check the places for being non-empty and having at least the expected list of elements

Parameters:
  • places (Places) – the places to check
  • expectedList (list) – the list of elements to check
setUp()[source]

Hook method for setting up the test fixture before exercising it.

tearDown()[source]

Hook method for deconstructing the test fixture after testing it.

testExtractorFromText()[source]

test different texts for getting geo context information

testExtractorFromUrl()[source]

test the extractor

testGeograpyIssue32()[source]

test https://github.com/ushahidi/geograpy/issues/32

testGetGeoPlace()[source]

test geo place handling

testIssue10()[source]

test https://github.com/somnathrakshit/geograpy3/issues/10 Add ISO country code

testIssue7()[source]

test https://github.com/somnathrakshit/geograpy3/issues/7 disambiguating countries

testIssue9()[source]

test https://github.com/somnathrakshit/geograpy3/issues/9 [BUG]AttributeError: ‘NoneType’ object has no attribute ‘name’ on “Pristina, Kosovo”

testStackOverflow54721435()[source]

see https://stackoverflow.com/questions/54721435/unable-to-extract-city-names-from-a-text-using-geograpypython

testStackoverflow43322567()[source]

see https://stackoverflow.com/questions/43322567

testStackoverflow54077973()[source]

see https://stackoverflow.com/questions/54077973/geograpy3-library-for-extracting-the-locations-in-the-text-gives-unicodedecodee

testStackoverflow54712198()[source]

see https://stackoverflow.com/questions/54712198/not-only-extracting-places-from-a-text-but-also-other-names-in-geograpypython

testStackoverflow55548116()[source]

see https://stackoverflow.com/questions/55548116/geograpy3-library-is-not-working-properly-and-give-traceback-error

testStackoverflow62152428()[source]

see https://stackoverflow.com/questions/62152428/extracting-country-information-from-description-using-geograpy?noredirect=1#comment112899776_62152428

tests.test_locator module

Created on 2020-09-19

@author: wf

class tests.test_locator.TestLocator(methodName='runTest')[source]

Bases: unittest.case.TestCase

test the Locator class from the locator module

checkExamples(examples, countries, debug=False, check=True)[source]

check that the given examples give results in the given countries

Parameters:
  • examples (list) – a list of example location strings
  • countries (list) – a list of expected country ISO codes

setUp()[source]

Hook method for setting up the test fixture before exercising it.

tearDown()[source]

Hook method for deconstructing the test fixture after testing it.

testDelimiters()[source]

test the delimiter statistics for names

testExamples()[source]

test examples

testGeolite2Cities()[source]

test the locs.db cache for cities

testHasData()[source]

check has data and populate functionality

testIsoRegexp()[source]

test regular expression for iso codes

testIssue15()[source]

https://github.com/somnathrakshit/geograpy3/issues/15 test Issue 15 Disambiguate via population, gdp data

testIssue17()[source]

test issue 17:

https://github.com/somnathrakshit/geograpy3/issues/17

[BUG] San Francisco, USA and Auckland, New Zealand should be locatable #17

testIssue19()[source]

test issue 19

testPopulation()[source]

test adding population data from wikidata to GeoLite2 information

testWordCount()[source]

test the word count

tests.test_places module

class tests.test_places.TestPlaces(methodName='runTest')[source]

Bases: unittest.case.TestCase

test Places

setUp()[source]

Hook method for setting up the test fixture before exercising it.

tearDown()[source]

Hook method for deconstructing the test fixture after testing it.

testPlaces()[source]

test places

tests.test_prefixtree module

tests.test_wikidata module

Created on 2020-09-23

@author: wf

class tests.test_wikidata.TestWikidata(methodName='runTest')[source]

Bases: unittest.case.TestCase

test the wikidata access for cities

setUp()[source]

Hook method for setting up the test fixture before exercising it.

tearDown()[source]

Hook method for deconstructing the test fixture after testing it.

testLocatorWithWikiData()[source]

test Locator

testWikidataCities()[source]
test getting city information from wikidata

1372 Singapore
749 Beijing, China
704 Paris, France
649 Barcelona, Spain
625 Rome, Italy
616 Hong Kong
575 Bangkok, Thailand
502 Vienna, Austria
497 Athens, Greece
483 Shanghai, China

testWikidataCountries()[source]

test getting country information from wikidata

Module contents

Indices and tables