Let’s say you dream about building your own fishing cabin by the lake. In your garage, you find a bunch of tools, a pile of wooden planks, and various nails and screws. Would you rather:
a) Start building the walls immediately, without checking whether you have enough nails, a proper hammer, or even enough planks. Who cares? You will figure it out eventually, in the middle of the process.
b) Or take a moment to check your set of tools, sort the planks by size and condition, perhaps dropping the rusty nails and going to the shop…
Our (love-)story with Spark started when the company decided to venture into the exciting world of AI. Very quickly, we came across the Databricks platform and, after exploring it during a two-week POC, we were ready to rumble.
From a Data Scientist’s perspective:
“The platform simplifies the Spark environment settings and offers collaborative workspaces for exploration, visualization and model deployment at light speed.”
For help getting started with Databricks, check out this article:
Starting to work with Spark can feel like we (almost) have to learn a new language: PySpark. It is close enough to…
Beyond the sarcasm of this quote, there is a reality: of all the statistical techniques, regression analysis is often referred to as one of the most significant in business analysis. Most companies use regression analysis to explain a particular phenomenon, build forecasts, or make predictions. These insights can be extremely valuable in understanding what can make a difference in the business.
When you work as a Data Scientist, building a linear regression model can sound pretty dull, especially when everything around you is about AI. However, I want to stress that mastering the main assumptions of linear regression…
Indexes are potent indicators for global and country-specific economies. They are widely used by governments and traders to formulate economic policies, refine external trade and measure changes in the value of money.
Indexes are meant to reflect changes in a variable (or group of variables) with respect to, for example, time or geographical location. That’s why they are commonly used to compare the level of a phenomenon on a certain date with its level on a previous date, or the levels of a phenomenon at different places on the same date.
For example: the price of oil in March 2020 compared to the price in…
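The comparison described above, the level of a variable on one date against its level on an earlier date, can be sketched as a simple index. This is only a minimal illustration: the base-100 convention, the function name `simple_index` and the prices are assumptions made for the example, not figures or code from the article.

```python
def simple_index(current, base):
    """Express `current` relative to `base`, with the base period set to 100."""
    if base == 0:
        raise ValueError("base value must be non-zero")
    return current / base * 100


# Hypothetical prices, for illustration only (not real market data).
base_price = 50.0     # price in the reference period
current_price = 75.0  # price in the period being compared

index = simple_index(current_price, base_price)
print(index)  # 150.0
```

An index above 100 means the variable has risen relative to the base period, and an index below 100 means it has fallen.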
8D music sounds like a new kind of audio technology coming straight from the future. In reality, this technology, along with ambisonics, has existed since the ’70s but never really took off.
But all of a sudden, a message promoting a new Pentatonix song made with 8D technology went viral on WhatsApp. Maybe you have already received it from some friends; if not, don’t worry, it will come:
According to the official documentation: “A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is always the same as the original input sample size but the samples are drawn with replacement.”
In other words, Random Forest is a powerful, yet relatively simple, data mining and supervised machine learning technique. It allows quick and automatic identification of relevant information from extremely large datasets. The biggest strength of the algorithm is that it relies on the…
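The two ingredients named in the quoted documentation, sub-samples drawn with replacement at the same size as the input, and averaging across estimators, can be sketched with the standard library alone. This is not a random forest and not the scikit-learn implementation: each "estimator" below is simply the mean of its bootstrap sample, standing in for a decision tree, and the names `bootstrap_sample` and `bagged_estimate` are hypothetical.

```python
import random
from statistics import mean


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def bagged_estimate(data, n_estimators=100, seed=0):
    """Average the estimates of many stand-in models.

    Each stand-in model is the mean of its own bootstrap sample;
    a real random forest would fit a decision tree instead.
    """
    rng = random.Random(seed)
    estimates = [mean(bootstrap_sample(data, rng)) for _ in range(n_estimators)]
    return mean(estimates)


# Hypothetical data, for illustration only.
values = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]

sample = bootstrap_sample(values, random.Random(42))
print(len(sample) == len(values))  # True: same size as the original input
print(bagged_estimate(values))     # averaged estimate across all samples
```

Because sampling is done with replacement, some observations appear several times in a given sample and others not at all, which is what makes the individual estimators differ from one another.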
Analytic Translator | AI/ML & Statistics Player | Unlock Business Opportunities ✅ https://www.linkedin.com/in/aureliegiraud