Op-Ed: 'Dirty data' search ranking people on Facebook for jobs?

Posted May 9, 2013 by Paul Wallis
A company called Identified has created an SEO option for recruiters to search Facebook for job candidates. This is “head hunting SEO” in embryo, and it's called SYMAN. It looks like Identified is underrating the value of their idea.
Sourcing Tool for Facebook powered by SYMAN
Identified, Inc.
The search environment for this idea is the first, gigantic, problem. Social media marketing is now extremely important, but there was a very long, pricey learning curve. The fizzy hype of social media in the past resulted in some very expensive impractical marketing before the experts came in and started creating working efficiencies.
The jobs market, in its ritualistic and ponderous way, embraced social media with all the grace of an elephant dancing with an egg. The result was some pretty lousy omelets and huge job search engines with all the agility of a house brick. Search efficiency varies from OK to great to absurd.
Identified can be credited with real guts for taking on this very difficult idea. The theory is that SYMAN cleans up the “dirty data” (redundant expressions, multiple ways of saying the same thing, references to qualifications, etc. scattered across Facebook), finds people who match recruiters’ needs, and ranks them in order of how well they meet the recruitment criteria.
Grappling with a monster
So far so good, but ranking people on this basis is likely to be tricky, right? The idea is to find talent, skills, and what is basically an “implied resume”. I spent years writing on job sites in Europe and America, and I can tell you that SYMAN is working with a very strange cultural beast.
The sheer amount of processing in job candidacy screening is staggering. When you’re talking about recruitment criteria, the tendency is to add more and more. The theory is that the additions add filters. The result is a mess of epic size and dubious efficiency.
So, add criteria to SYMAN, and what do you get? Arguably, you get better results, but in practice you also get complexity, which is exactly what SYMAN is trying to reduce to something useful. The problem for Identified is that recruiters, naturally, think like recruiters, even when they and the search engine are on the same page. Add to this the usual slapdash nature of Facebook entries and profiles, and you get a real challenge at both ends of the search: the search criteria and the available data.
TechCrunch:
…Identified was met with the challenge of having to convince and incentivize the younger generations to claim and fill out their Facebook-derived profiles on its platform — not such an easy task when so many people already have a litany of online profiles to manage.
But doing so is critical for Identified, because without users claiming their profiles and adding more data, an Identified profile isn’t worth much more than any other. Plus, it means the company has fewer data points with which to work when trying to assign an accurate score or effectively tracking a user’s career progress.
A much bigger and better idea?
Yes, SYMAN is tracking moving targets with varying degrees of quality of information and searchability. Its results depend on finding useful elements in the usual mess of a Facebook profile. However, SYMAN has a trick it may not be using. The methodology may be infinitely more valuable than the basic function.
This is a very new search dynamic. Google tends to search static identities and uses keywords, links, relevance and references to pin down and rank search results. It does this with formalized SEO criteria and well-understood rules for searches.
The actual efficiency of searches for users, however, is highly debatable for both experts and consumers. Everyone’s seen searches which don’t even relate to what they want. “Dirty data” is a real problem for searches, adding irrelevance on a huge scale, and that’s where SYMAN may have hit the jackpot.
Cleaning up data and defining relevancies is both obvious and necessary. Searching common terms can be horrific. Search the word “engineer” on Monster, and you’ll be instantly confronted with absolutely useless mountains of results unless you add very specific criteria. For aeronautics systems engineers, for example, you may be better off putting specific system types into your job search rather than even using the word “engineer”.
SYMAN can apparently create a multi-tiered frame of reference for a search, eliminating synonyms and adding qualifiers, removing ambiguous search elements and adding values.
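To make the “dirty data” idea concrete, here is a minimal sketch of one small piece of it: collapsing different ways of saying the same thing into one label. This is not SYMAN’s actual method, which Identified hasn’t published in any detail I’ve seen. The synonym table and the function names (`CANONICAL_TERMS`, `normalize_term`, `clean_profile`) are invented purely for illustration.

```python
import re

# Hypothetical synonym table: every entry is invented for illustration
# and has nothing to do with SYMAN's real data.
CANONICAL_TERMS = {
    "rn": "registered nurse",
    "reg. nurse": "registered nurse",
    "registered nurse": "registered nurse",
    "aero engineer": "aeronautics engineer",
    "aeronautical engineer": "aeronautics engineer",
    "aeronautics engineer": "aeronautics engineer",
}


def normalize_term(raw):
    """Lower-case, collapse whitespace, and map known variants to one label."""
    cleaned = re.sub(r"\s+", " ", raw.strip().lower())
    # Unknown terms pass through unchanged rather than being guessed at.
    return CANONICAL_TERMS.get(cleaned, cleaned)


def clean_profile(raw_terms):
    """Return de-duplicated canonical terms for one profile, in first-seen order."""
    seen = []
    for term in raw_terms:
        canonical = normalize_term(term)
        if canonical not in seen:
            seen.append(canonical)
    return seen


print(clean_profile(["RN", "Registered  Nurse", " reg. nurse", "Miami"]))
```

Three sloppy spellings of the same qualification come out as a single searchable term. A real system would need vastly bigger tables and some way of handling ambiguity, which is presumably where the hard work went.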
Search dynamics and SYMAN
Now, given all the above, imagine what would happen if Google did that. Useful to the world? Incredibly so. Useful to advertisers and Google itself? Very.
Search ranking could also receive a new asset. SYMAN’s ability to search variable information quality could produce multiple rankings, at least in theory.
For instance, you want:
An aeronautics engineer
Someone familiar with Microsoft Aero Widget Systems
Someone with a Stanford/Harvard/Princeton scale education background
Experience with military systems
Experience in intellectual property values and research
See any options for rankings here? You can have values for those who meet all these criteria, but you can also have separate weightings for those who fit specifically valued criteria. In effect, you get five ranking options, not just one, within the same search.
In the example above, if “military systems” is the absolute essential, you rank your candidates according to that criterion more highly than the others. This could be applied even more effectively to relatively “clean”, better structured SEO-friendly information, helping everyone online.
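The “five rankings from one search” idea can be sketched in a few lines. This is my own illustration under stated assumptions, not a description of how SYMAN scores anyone: the criterion labels, the `boost` value of 5.0, and the function names (`score`, `rank`, `rankings_by_priority`) are all hypothetical, and candidates are assumed to arrive as sets of already-cleaned tags.

```python
# Hypothetical labels standing in for the five criteria in the example above.
CRITERIA = [
    "aeronautics engineer",
    "aero widget systems",
    "top-tier education",
    "military systems",
    "intellectual property",
]


def score(candidate_tags, weights):
    """Sum the weights of every criterion the candidate meets."""
    return sum(w for criterion, w in weights.items() if criterion in candidate_tags)


def rank(candidates, weights):
    """Return candidate names ordered from highest to lowest score."""
    return sorted(candidates, key=lambda name: score(candidates[name], weights), reverse=True)


def rankings_by_priority(candidates, boost=5.0):
    """Build one ranking per criterion, treating that criterion as the essential one."""
    result = {}
    for priority in CRITERIA:
        # The prioritized criterion gets the (assumed) boost; the rest count as 1.
        weights = {c: (boost if c == priority else 1.0) for c in CRITERIA}
        result[priority] = rank(candidates, weights)
    return result


# Made-up candidates, described by the criteria they meet.
candidates = {
    "Candidate A": {"aeronautics engineer", "aero widget systems", "top-tier education"},
    "Candidate B": {"aeronautics engineer", "military systems"},
}

equal_weights = {c: 1.0 for c in CRITERIA}
print(rank(candidates, equal_weights))
print(rankings_by_priority(candidates)["military systems"])
```

With equal weights, Candidate A meets more criteria and comes first. Make “military systems” the essential and Candidate B moves to the top, without running a second search.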
SYMAN appears to have the natural ability to make these distinctions within its search context. I dread to think how much sheer hard work must have gone into creating the “dirty data” cleaning operations, but Identified may have done itself a big favor in the process.
By focusing on the incredible mass of information in a truly thankless area of searching like the recruitment industry and jobs market, it’s apparently found a real streak of gold. Believe me, most job ad writers aren’t exactly SEO writers, and most people who write their own or other people’s profiles are much worse. Searchable in theory, yes. Searchable to the point of being ultra-efficient, nothing like.
If SYMAN can make sense of that sort of half-digestible data and get reliable, quality-driven rankings, it deserves a good hard look from the internet giants. There looks to be an additional role in Big Data, managing the disparate data that seems to fly in huge flocks under the radar of those systems, too.
Special note: Also check out how Identified ran the business model for this operation. They not only got funding, about $21 million, but managed to get this extraordinary product into a functional form. (One example cited for the recruitment product demo is finding Registered Nurses in Miami. I’ve done searches like that in the US healthcare sector, and it’s a very tough task. SYMAN can obviously deliver in its stated commercial role, and that’s no minor achievement.)