Software Engineering, Programming

How Google, Facebook, Quora and Medium apply this 50-year-old data structure for efficient search

Have you ever wondered how big companies like Google and Facebook can check so quickly whether a given username is already taken in their massive databases? That is the topic of this article. You will learn how a data structure called a Bloom filter is used for efficient filtering.

Definition

The Bloom filter was invented in 1970 by Burton Howard Bloom. It is a probabilistic data structure whose main goal is to test whether an element is a member of a set. The results returned by the Bloom…
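As a rough illustration of the membership test described above, here is a minimal Python sketch of a Bloom filter. It is not taken from the article: the class name `BloomFilter`, the method `might_contain`, the default sizes, and the choice of salted SHA-256 for hashing are all assumptions made for this example. It relies on the standard property that a Bloom filter can report false positives but never false negatives.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: a bit array plus k hash positions per item."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        # Sizes are illustrative defaults, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, kept simple for readability

    def _positions(self, item: str):
        # Derive k positions by salting SHA-256 with the hash index
        # (an assumption for this sketch; real systems often use faster hashes).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Set every position derived from the item.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # False means "definitely not in the set"; True means "possibly in the set".
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical usage: tracking usernames that are already taken.
taken = BloomFilter()
for name in ("alice", "bob"):
    taken.add(name)

print(taken.might_contain("alice"))
print(taken.might_contain("carol"))
```

A `True` answer only means the username may be taken, so a real system would typically confirm it against the database, while a `False` answer lets it skip that lookup entirely.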


Data Governance, Data Protection

Microsoft’s SDK for Data Protection and Anonymization

The phrase “Data is the new oil” became popular after The Economist published a story titled “The world’s most valuable resource is no longer oil, but data” in 2017.

If we extend the analogy between data and oil, data breaches can have devastating effects, much as oil spills do on the environment.

A data breach is a confirmed incident in which sensitive, confidential or otherwise protected data has been accessed and/or disclosed in an unauthorized fashion. Data breaches may involve personal health information (PHI), personally identifiable information (PII), trade secrets or intellectual property.

Source: https://searchsecurity.techtarget.com/definition/data-breach

Famous data breach cases

Adobe: In…


Data Engineering

Simplifying data infrastructure and accelerating innovation

The history of data storage starts back in the 1950s, when punch cards were used for storing data generated by computers. A lot has changed since then, and this article covers one of the latest trends in the industry: the Lakehouse.


Data Science, Machine Learning

Improving the fairness of machine learning models

We have witnessed rapid advances in machine learning in the past few years. New technologies have shown dramatic improvements in technical performance and more companies are relying on A.I. for their decision-making process or using it as part of their products.

However, along with these advances, unfair and discriminatory results have increased at the same pace. In this article, we’ll cover some of these issues and the possible solutions to minimize them.

In this article, you’ll read about:

  • The trouble with bias
  • What’s responsible ML?
  • What’s Fairlearn?
  • Defining fairness in A.I.
  • What are the Use Cases?

The trouble with bias

The concern about bias…


The technology sector has always been dominated by men, but a quick Google search turns up initiatives such as the NGO http://mulheresnatecnologia.org, the blog http://mulheres.eti.br, and even events such as http://www.mulherestechday.com.br that seek to change this trend. The number of women in courses once considered typically male has been growing, but there is still much to be done. It is well known that considerable differences remain between average male and female salaries. The same happens when women start their own businesses, and that is what I discuss in this article.

Bloomberg conducted a study focusing on 2,005 startup founders from…

Bruno Cordeiro

Data scientist, writer, traveler & coffee addict. #Machine Learning #Open-Source #Company Culture
