Suppose you have a very large dataset - far too large to hold in memory - with duplicate entries. You want to know how many duplicates it contains, but your data isn't sorted, and it's big enough that sorting and counting is impractical. How do you estimate how many unique entries the dataset contains?
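One well-known family of answers relies on hashing. Below is a minimal Python sketch of one such estimator, the k-minimum-values (KMV) sketch: hash every entry to a pseudo-uniform number in [0, 1) and keep only the k smallest hash values seen. If there are n distinct entries, the k-th smallest hash lands near k/n, so (k - 1) divided by it estimates n. The choice of SHA-1 and k = 1024 here are illustrative assumptions, not a requirement of the technique:

```python
import hashlib
import heapq

def estimate_cardinality(items, k=1024):
    """Estimate the number of distinct items in a stream using a
    k-minimum-values (KMV) sketch. Memory use is O(k), independent
    of the size of the dataset."""
    max_hash = float(2 ** 64)
    heap = []        # max-heap (values negated) of the k smallest hashes
    in_heap = set()  # hashes currently held, so duplicates are skipped
    for item in items:
        # Map the item to a pseudo-uniform value in [0, 1).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big") / max_hash
        if h in in_heap:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            in_heap.add(h)
        elif h < -heap[0]:  # smaller than the current k-th smallest
            evicted = -heapq.heappushpop(heap, -h)
            in_heap.discard(evicted)
            in_heap.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct values: count is exact
    # With n distinct uniform values, the k-th smallest sits near k/n,
    # so n is approximately (k - 1) / kth_smallest.
    return int((k - 1) / -heap[0])
```

Memory stays fixed at k hash values no matter how large the dataset grows, and the relative error is on the order of 1/sqrt(k), roughly 3% at k = 1024.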
As recently as this week, I’ve been involved in conversations with customers about how we can help their teams deliver more predictably. How can they meet commitments at every level of the organization: project, program, and portfolio?
GitHub has a nice API for inspecting repositories – it lets you read gists, issues, commit history, files, and so on. Git repository data lends itself to demonstrating the power of combining full-text and faceted search...
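As a small taste of that API, here is a sketch that lists a repository's open issues through the REST endpoint using the requests library; octocat/Hello-World is GitHub's public demo repository, and the token is optional for public data (unauthenticated requests are simply rate-limited more aggressively):

```python
import requests

def fetch_open_issues(owner, repo, token=None):
    """Return the open issues of a repository via GitHub's REST API."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    resp = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/issues",
        headers=headers,
        params={"state": "open"},
    )
    resp.raise_for_status()
    return resp.json()

# GitHub's demo repository; any public owner/repo pair works.
for issue in fetch_open_issues("octocat", "Hello-World")[:5]:
    print(issue["number"], issue["title"])
```

One caveat: this endpoint also returns pull requests (GitHub models them as issues under the hood), so filter out items carrying a `pull_request` key if you want issues only.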
During our induction into the IBM family, one of our new colleagues told an anecdote about a firm that outsourced its mobile application development. Managing the relationship between outsourced work and what is being developed in house is a challenge much like the one manufacturers face with their supply chains.
We need a way to match queries to entities in our Postgres database. At first, this might seem like a simple problem with a simple solution, especially if you’re using an ORM: just jam the user input into an ORM filter and retrieve every matching string. But there’s a problem.
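To see why, consider a minimal SQLAlchemy sketch; the Entity model and its name column below are hypothetical stand-ins for whatever table you are actually matching against:

```python
from sqlalchemy import String, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column

class Base(DeclarativeBase):
    pass

class Entity(Base):
    """Hypothetical model standing in for the real table."""
    __tablename__ = "entities"
    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str] = mapped_column(String)

def find_exact(session: Session, query: str):
    # The naive filter: only rows whose name equals the user input
    # character-for-character come back.
    return session.scalars(
        select(Entity).where(Entity.name == query)
    ).all()

def find_loose(session: Session, query: str):
    # Case-insensitive substring match via Postgres ILIKE. Better,
    # but user-supplied % or _ act as wildcards, and typos or
    # reordered words still match nothing.
    pattern = f"%{query}%"
    return session.scalars(
        select(Entity).where(Entity.name.ilike(pattern))
    ).all()
```

An exact == filter misses "Acme Corp" when the user types "acme", and even ILIKE won't bridge "Acme Corporation" and "Acme Corp." - exactly the kind of gap that simple filters leave open.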