On the surface, most people wouldn’t view identity and access management as a data problem. “Give me access to [some application] right now” is usually top of mind, followed by resentment about access reviews, and possibly fear of audit issues, among other things. All of those are reasonable thoughts, but under the surface lies a big, nasty data problem.
Suppose you are fortunate (or unfortunate) enough to have a centralized identity management system that helps you manage access reviews. What data needs to come together to give you the information you need to perform an access review?
Aggregating data from dozens of systems
User and entitlement data needs to be aggregated for X number of applications, which involves querying X number of databases, directories, mainframe VSAM files, or… oh, that hosted SaaS application with no API that only gives access reports in canned PDF files. By the way, this data can’t be aggregated just once: your auditor wants it to be recent, and real-time synchronization is the most accurate. Tough start for our quest for complete and accurate information to review.
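To make the aggregation step concrete, here is a minimal Python sketch that flattens two differently shaped sources into one common record format. Everything in it is an assumption for illustration: the `AccessRecord` fields, the two source shapes (a directory-style list of groups and report-style rows), and the sample values. A real connector would be far messier.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessRecord:
    """One (application, account, entitlement) fact plus when it was collected."""
    application: str
    account: str
    entitlement: str
    collected_at: datetime


def from_directory(application, entries, collected_at):
    """Flatten directory-style entries: one account holding a list of groups."""
    return [
        AccessRecord(application, entry["uid"], group, collected_at)
        for entry in entries
        for group in entry["groups"]
    ]


def from_report_rows(application, rows, collected_at):
    """Convert report-style rows that already hold one (account, role) pair."""
    return [
        AccessRecord(application, account.strip(), role.strip(), collected_at)
        for account, role in rows
    ]


# Hypothetical sample data standing in for two real sources.
now = datetime.now(timezone.utc)
records = (
    from_directory("crm", [{"uid": "jdoe004", "groups": ["sales", "admin"]}], now)
    + from_report_rows("mainframe", [("DOEJ0004 ", "PAYROLL-READ")], now)
)
```

Stamping every record with `collected_at` is what lets you show an auditor how recent the data actually is.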
Matching data for thousands of records
User data needs to be matched with active employees from HR. This means yet another query of employee data from the HR system. Incidentally, this information is sensitive and needs to be heavily scoped before you can obtain it. Beyond that, the user data in your queries from X number of applications comes in any number of formats. John Doe in one system is jdoe004 in another system, firstname.lastname@example.org in the next, DOEJ0004 on your mainframe, and so on. This can become quite the mapping exercise.
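One common way to approach the mapping exercise is a correlation table that ties each application account to an HR employee ID. The sketch below assumes such a table already exists. The employee IDs, names, and the `classify_account` helper are hypothetical, and building the table in the first place is the hard part that this code does not solve.

```python
# Hypothetical HR extract, keyed by employee ID.
EMPLOYEES = {
    "E1001": {"name": "John Doe", "active": True},
    "E1002": {"name": "Jane Roe", "active": False},
}

# Hypothetical correlation table: (application, lowercased account) -> employee ID.
ACCOUNT_OWNERS = {
    ("crm", "jdoe004"): "E1001",
    ("mainframe", "doej0004"): "E1001",
    ("crm", "jroe001"): "E1002",
}


def classify_account(application, account):
    """Return (status, employee_id) for one application account."""
    key = (application.lower(), account.strip().lower())
    employee_id = ACCOUNT_OWNERS.get(key)
    if employee_id is None:
        # Nobody in HR is known to own this account.
        return ("orphan", None)
    if not EMPLOYEES[employee_id]["active"]:
        # The owner is known but is no longer an active employee.
        return ("inactive_owner", employee_id)
    return ("matched", employee_id)
```

Orphan accounts and accounts with inactive owners are usually the first things a reviewer or auditor will want to see.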
Codifying knowledge about thousands of data elements
Just showing people user and entitlement data without telling them what it means isn’t helpful. As much as you probably deserve a reprieve for accomplishing the first two steps, the people reviewing and auditing the access are not likely to let you off the hook willingly. You need to collect knowledge about your application roles and entitlements from dozens of technical staff and thousand-page application security manuals, store the knowledge, and present it back to the appropriate reviewers during the access review process. The IAM data problem wouldn’t be complete without some unstructured data and knowledge management.
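In its simplest form, the stored knowledge is a glossary keyed by application and entitlement. The Python sketch below assumes that shape. The `GLOSSARY` contents and both helper names are made up for illustration.

```python
# Hypothetical glossary: (application, entitlement) -> plain-language meaning.
GLOSSARY = {
    ("mainframe", "PAYROLL-READ"): "Read-only access to payroll records.",
    ("crm", "admin"): "Can create users and change any customer record.",
}

FALLBACK = "No description on file; ask the application owner."


def describe(application, entitlement):
    """Return the reviewer-facing description, or a visible fallback."""
    return GLOSSARY.get((application, entitlement), FALLBACK)


def undocumented(pairs):
    """List the (application, entitlement) pairs that still need a description."""
    return sorted(set(pairs) - set(GLOSSARY))
```

Running `undocumented` over everything you aggregated gives you a to-do list for those conversations with technical staff.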
Storing responses from multiple users
After you’ve successfully presented aggregated, matched, and codified user and entitlement data to your access reviewers, you then need to be prepared to catch and store all of the information they generate from the access review process. That’s right, for all of the records you queried and collected, you will be receiving one or more pieces of data from the access reviewers. They’re telling you the all-important information that comes from the access review process: should the access be kept, changed, or removed? Finally, all of your hard data work is paying off… except you’re not done yet.
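As a rough sketch of catching those responses, the Python below stores each decision in a SQLite table using the standard `sqlite3` module. The table name, columns, and the `open_store` and `record_decision` helpers are assumptions, not a prescribed schema. The three allowed decisions come straight from the question above: keep, change, or remove.

```python
import sqlite3
from datetime import datetime, timezone

DECISIONS = ("keep", "change", "remove")


def open_store(path=":memory:"):
    """Open (or create) the decision store. Defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS review_decision (
            application TEXT NOT NULL,
            account     TEXT NOT NULL,
            entitlement TEXT NOT NULL,
            reviewer    TEXT NOT NULL,
            decision    TEXT NOT NULL
                        CHECK (decision IN ('keep', 'change', 'remove')),
            comment     TEXT,
            decided_at  TEXT NOT NULL
        )
        """
    )
    return conn


def record_decision(conn, application, account, entitlement,
                    reviewer, decision, comment=""):
    """Store one reviewer decision with a UTC timestamp."""
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    decided_at = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO review_decision VALUES (?, ?, ?, ?, ?, ?, ?)",
        (application, account, entitlement, reviewer, decision, comment, decided_at),
    )
    conn.commit()
```

Recording who decided and when, alongside the decision itself, is what makes the data usable for the audit report later.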
Reporting data to different audiences in a variety of formats
Once all of the access review data has been collected, the job isn’t done until the results have been given to everyone who needs them. Security administrators need actionable information about access changes and removals so they can make the corresponding changes in their applications and platforms. By the way, bonus points if you can give this information to their applications directly so that the actions can be automated. Your auditors also need this data, except they need it to be neatly formatted in a report that shows who reviewed and approved what and when. Your access review data management job is now done… until next quarter.
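The two audiences can be served from the same decision data. The Python sketch below assumes each decision is a dictionary with the hypothetical fields listed in `FIELDS`. It produces a machine-readable action list for administrators and a CSV for auditors, using only the standard `csv`, `io`, and `json` modules.

```python
import csv
import io
import json

# Hypothetical field names for one reviewer decision.
FIELDS = ["application", "account", "entitlement",
          "reviewer", "decision", "decided_at"]


def admin_actions(decisions):
    """Keep only the decisions that require a change in the target system."""
    return [d for d in decisions if d["decision"] in ("change", "remove")]


def audit_csv(decisions):
    """Render every decision as CSV: who reviewed what, the outcome, and when."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(decisions)
    return buffer.getvalue()


# Hypothetical sample decisions.
decisions = [
    {"application": "crm", "account": "jdoe004", "entitlement": "admin",
     "reviewer": "manager1", "decision": "remove",
     "decided_at": "2024-01-15T10:00:00+00:00"},
    {"application": "crm", "account": "jdoe004", "entitlement": "sales",
     "reviewer": "manager1", "decision": "keep",
     "decided_at": "2024-01-15T10:01:00+00:00"},
]

actions_json = json.dumps(admin_actions(decisions), indent=2)
report = audit_csv(decisions)
```

The JSON output is the kind of thing you could hand to an application directly for the bonus points, while the CSV is a starting point for the neatly formatted audit report.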
In some domains, any one of these steps in isolation would be a difficult problem. IAM is frequently faced with all of these challenges, and access review is only one example. Better identity and access management also means better data management.