This audit uses a two-pass classification system to identify Flock Safety ALPR nodes in the database. It is not an automated process; each node will need to be confirmed individually. While gathering the data, I added State and County fields to my local database. I don't know that this information would be worth adding to OSM, but it helps break the work into manageable chunks. My plan is to go county by county, with each county as a separate changeset.
While I have made edits to OSM, this is by far the largest project I have considered. I would like any tips, feedback, or guidance.
Phase 1: Automated Verification (The Gold Standard)
Matches nodes that exactly meet the following key-value schema:
man_made=surveillance
surveillance:type=ALPR
manufacturer=Flock Safety
manufacturer:wikidata=Q108485435
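For concreteness, here is a minimal sketch of what the Phase 1 check might look like. This is my own illustration, not the actual notebook code; the function name and node representation are assumptions (extra tags such as `direction=` are allowed, only the four schema keys are required to match exactly):

```python
# Hypothetical sketch of the Phase 1 "gold standard" matcher.
# The schema keys/values come from the report; everything else is illustrative.
GOLD_STANDARD = {
    "man_made": "surveillance",
    "surveillance:type": "ALPR",
    "manufacturer": "Flock Safety",
    "manufacturer:wikidata": "Q108485435",
}

def is_gold_standard(tags: dict) -> bool:
    """True when every schema key is present with the exact value."""
    return all(tags.get(k) == v for k, v in GOLD_STANDARD.items())
```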
Phase 2: Heuristic Discovery
Identifies needs_review nodes using a wildcard search (%floc%) across the operator field and the full JSON tag bucket.
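The Phase 2 heuristic can be sketched roughly as below. Again, this is an assumption about the approach rather than the actual notebook code: the `%floc%` wildcard is approximated by a case-insensitive substring test over the `operator` tag and the serialized tag bucket (in practice, Phase 1 matches would presumably be excluded first):

```python
import json

def needs_review(node: dict) -> bool:
    """Flag a node when 'floc' appears, case-insensitively, in the
    operator tag or anywhere in the full JSON tag bucket."""
    tags = node.get("tags", {})
    if "floc" in tags.get("operator", "").lower():
        return True
    return "floc" in json.dumps(tags).lower()
```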
II. Executive Statistics
| Metric | Value |
| --- | --- |
| Total Surveillance Nodes Analyzed | 97,123 |
| Verified ‘Gold Standard’ Nodes | 50,622 |
| ‘Discovery’ Queue (Needs Review) | 8,565 |
| Current Database Compliance | 85.5% |
III. Key-Value Convention Analysis
| Tagging Convention | Entire Database | Verified Nodes | Needs Review |
| --- | --- | --- | --- |
| direction= | 66,454 | 49,621 | 6,015 |
| camera:direction= | 9,387 | 643 | 1,736 |
| manufacturer:wikidata= | 56,763 | 50,622 | 383 |
Recommendation: The bulk change should inject manufacturer:wikidata=Q115167664 to align with OSM best practices for branded infrastructure.
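A conservative injection might look like the sketch below. This is illustrative only: it takes the QID as a parameter rather than hard-coding one (the report lists two different QIDs, Q108485435 in the Phase 1 schema and Q115167664 here, and which is correct would need to be settled first), and it never overwrites an existing value, so conflicts surface for manual review instead of being clobbered:

```python
def inject_wikidata(tags: dict, qid: str) -> dict:
    """Return a copy of tags with manufacturer:wikidata added only
    when absent; existing values are left untouched for review."""
    if "manufacturer:wikidata" not in tags:
        return {**tags, "manufacturer:wikidata": qid}
    return dict(tags)
```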
IV. Geographic Cleanup Priorities
Counties with the highest density of non-standard nodes:
Alameda County, California | 971 nodes
Harris County, Georgia | 385 nodes
Los Angeles County, California | 178 nodes
Riverside County, California | 166 nodes
Tulsa County, Oklahoma | 149 nodes
Jefferson County, Alabama | 129 nodes
Monterey County, California | 126 nodes
Cuyahoga County, Ohio | 124 nodes
Sangamon County, Illinois | 123 nodes
Orange County, California | 121 nodes
V. Data Variance Analysis
Common deviations found in the ‘Discovery’ queue requiring remediation:
Sounds good. You might want to propose a change or start a discussion rather than only posting an LLM-generated report. What changes are you proposing? How will you do it? Who’s making the changes?
I posted all the findings hoping for some guidance from people who have more experience than me. Because of the number of variations in the data, I don't know the best place to start. Then I got to thinking about apps that pull from OSM, and I didn't want to break something if an app depended on certain tags. I thought I was doing a good thing by providing as much information as possible. The report was actually generated from a Jupyter Notebook; I gave the output to an LLM to format it so it would be easier to read. Most LLMs would choke on a 97,000-node dataset..lol
The report is using a lot of novel terminology that won’t be familiar to any of the more experienced mappers here. What does “gold standard”, “verified”, or “discovery” mean in this context? What does remediation entail for a surveillance camera node that isn’t related to Flock?
What tools are you familiar with for editing OSM? Some tag replacements can be carried out reliably in JOSM, while others need manual review that can benefit from crowdsourcing on MapRoulette.
You are correct. I provided too much information and not enough detail for a meaningful discussion of what my goals and methods would be. I'll just call this post what it was: an audit of variations I found in tagging conventions. I will make another post that is more mindful of terminology and lays out my goal and method in a more appropriate manner. Thanks everyone for the feedback.
If there’s any audit or automated update done with respect to Flock stuff, it would be really good to audit multiple instances placed in one location.
Sure, there can and will be multiple cameras placed next to each other, but I routinely clear notes asking for one to be added when multiple already exist, sometimes pointing in the same direction. Some people place them right by the road, some place them way off, etc. These “deflock” tools lack any QA.
You kinda nailed why that would be nearly impossible to do with any automated algorithm. Near me there is an intersection that has 4 Flock Falcons, 1 Flock Condor, 2 Motorola LPRs, and, right next to the intersection, a RedSpeed LPR for a school zone. Even if all the tags were perfect, it would still need boots on the ground to confirm what is what.
I also see the same thing you mention, where the same Flock Falcon has 3 nodes spread over about a quarter mile. But I couldn't know for sure they were duplicates until I drove by to make sure there wasn't another one on the other side of the 4-lane divided highway.
I’ve removed several duplicate nodes and have about 6 more to go, and that’s just one county out of the 67 in Florida. I can’t imagine how many duplicates there are across the more than 3,000 county-equivalent areas in the lower 48 states.
I approve of automated edits standardizing manufacturer= into manufacturer:wikidata=, as well as adding operator:wikidata= to all the existing operators.
Obviously this should not be part of the group.
These should be changed to camera;ALPR or the flipped version.
These are bogus and should be deleted for resurvey.