The specific case where I am looking into an import (of website tags, for a start) is Empik in Poland.
See the empik_pl website import candidates (the entire website is a work in progress and experimental, so use it at your own risk. I may filter out more data and slap more disclaimers on it; let me know if anything like that should be changed).
Looking at, say, Salon Empik - Wrocław Zwycięska may ring alarm bells, as the page has
<iframe width="100%" height="100%" style="border: 0px;" loading="lazy" allowfullscreen="" src="https://www.google.com/maps/embed/v1/place?key=AIzaSyBl5C-uONKhzn9Nmn3DJXAP42tYWejo5YU&q=Empik,Wrocław Zwycięska,ul. Zwycięska 33"></iframe>
and, as I understand it, it displays the location on Google Maps, with the geolocation done by Google Maps from the address.
I definitely would not want to do imports based on addresses geolocated with Google Maps, and I would revert any such import that I spotted.
But the spider is defined at alltheplaces/locations/spiders/empik_pl.py (commit 350fd8686ccf52039969ccaf39317bbae22fc2fb, alltheplaces/alltheplaces on GitHub), and it actually pulls data from Empik itself.
This can be replicated with the following Python script:
import requests
import rich

URL = "https://www.empik.com/ajax/delivery-point/empik?query="
# The same token is sent both as a header and as a cookie.
CSRF_TOKEN = "42adc778-4158-4646-8ca9-e97ce140da75"

response = requests.post(
    URL,
    headers={"X-CSRF-TOKEN": CSRF_TOKEN},
    cookies={"CSRF": CSRF_TOKEN},
)
try:
    data = response.json()
    rich.print(data)
except ValueError:  # JSON decode errors are ValueError subclasses
    print("Failed to decode JSON:", response.text)
The response has entries like
{
    'id': 123,
    'deliveryPointType': 11,
    'city': 'Żywiec',
    'name': 'Żywiec Lider (SP)',
    'address': 'ul. Zielona 3',
    'phone': '695550266',
    'faxNumber': None,
    'cellPhone': None,
    'email': 'zywiec.lider@empik.com',
    'postCode': '34-300',
    'mondayWorkingHours': '9:00-20:00',
    'tuesdayWorkingHours': '9:00-20:00',
    'wednesdayWorkingHours': '9:00-20:00',
    'thursdayWorkingHours': '9:00-20:00',
    'fridayWorkingHours': '9:00-20:00',
    'saturdayWorkingHours': '9:00-20:00',
    'sundayWorkingHours': '10:00-18:00',
    'phoneVisible': True,
    'longitude': 19.2006465,
    'latitude': 49.6909741,
    'lastUsedDate': None,
    'blockedTemporarily': False,
    'temporarilyBlockedForSmallGauge': False,
    'temporarilyBlockedForAverageGauge': False,
    'manuallyBlockedByBusiness': False,
    'empikStoreType': 1,
    'storePage': '/salony-empik/zywiec/zywiec-lider-sp,123,e',
    'closed': False,
    'default': False
}
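This data looks safe to use to me, since the coordinates and the store page come from Empik's own endpoint rather than from Google Maps.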
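To illustrate how an entry like the one above could be turned into a website tag candidate, here is a minimal sketch. It is not what the spider or my import tooling actually does. The function name `website_candidate` is hypothetical, and it assumes that `storePage` is a path relative to https://www.empik.com (the same host as the endpoint) and that entries marked `closed` should be skipped.

```python
from typing import Optional

# Assumption: store pages live on the same host as the AJAX endpoint.
BASE_URL = "https://www.empik.com"


def website_candidate(entry: dict) -> Optional[dict]:
    """Build a candidate with coordinates and a website URL from one entry.

    Returns None when the entry is marked closed or lacks the needed fields.
    """
    if entry.get("closed"):
        return None
    store_page = entry.get("storePage")
    lat = entry.get("latitude")
    lon = entry.get("longitude")
    if not store_page or lat is None or lon is None:
        return None
    return {
        "lat": lat,
        "lon": lon,
        "website": BASE_URL + store_page,
    }


# Subset of the sample entry shown above.
sample = {
    "id": 123,
    "name": "Żywiec Lider (SP)",
    "longitude": 19.2006465,
    "latitude": 49.6909741,
    "storePage": "/salony-empik/zywiec/zywiec-lider-sp,123,e",
    "closed": False,
}

print(website_candidate(sample))
```

Whether the resulting URL is the right value for the website tag would still need to be checked by hand for each candidate before any edit.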