A list of concepts and mental models, explained in very simple terms, that help improve critical thinking and
decision-making:
Pareto Principle
Approximately 80% of effects stem from 20% of causes, implying that a small number
of inputs yield a majority of the results.
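A minimal sketch of how one might check the 80/20 pattern on a dataset. The function name `top_share` and the revenue figures are hypothetical, chosen only for illustration.

```python
def top_share(values, top_fraction=0.2):
    """Return the share of the total contributed by the largest `top_fraction` of values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values, reverse=True)
    # Number of items in the top group; always at least one.
    k = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical revenue per customer (ten customers).
revenue = [500, 300, 50, 40, 30, 25, 20, 15, 10, 10]
print(f"Top 20% of customers bring in {top_share(revenue):.0%} of revenue")
```

Real data rarely splits exactly 80/20; the point is to measure how concentrated the results are.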
Regret Minimization Framework
Imagine yourself in the future, reflecting on your current decision. Choose the
option that minimizes potential regret.
This concept ties in nicely with Problem Inversion: the practice
of thinking through problems in reverse.
Forced serendipity
Consistent hard work and dedication increase the likelihood of encountering
fortunate events. "The harder I work, the luckier I seem to get."
Principal-Agent Problem
A conflict of interest arises when an asset owner (the principal) appoints a manager
(the agent) whose priorities differ from the owner's.
Further reading: Skin in the Game by Nassim Nicholas Taleb.
Second-Order Thinking
Consider not only the immediate consequences of a decision but also its subsequent
effects.
In other words, always ask "And then what?"
Opportunity Cost
The potential benefit or value forfeited when choosing one alternative over
another.
Barbell strategy
Balancing risk and reward by investing in high-risk and no-risk assets, while
avoiding moderate-risk options.
It's a numbers game
Success often depends on measurable key performance indicators (KPIs) that quantify
results.
To Care and Communicate
Achieving success often requires genuine passion for your work and effective
communication of your ideas.
Network Effects
The value of a good or service increases as more people use or participate in it.
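One common simplification (an assumption here, not a claim from the text above) treats the value of a network as proportional to the number of possible pairwise connections between its users. A short sketch shows how quickly that count grows:

```python
def pairwise_connections(users: int) -> int:
    """Number of distinct user pairs in a network of `users` participants."""
    if users < 0:
        raise ValueError("users must be non-negative")
    return users * (users - 1) // 2


for n in (2, 10, 100):
    print(f"{n} users -> {pairwise_connections(n)} possible connections")
```

Going from 10 to 100 users multiplies the user count by 10 but the possible connections by 110, which is one way to picture why each new participant can add value for everyone else.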
Ockham's Razor
When faced with multiple explanations, the simplest one is often the most likely to
be correct.
Parkinson's Law
Work expands to fill the time available for its completion, implying that setting
shorter deadlines can improve productivity.
Dunning-Kruger Effect
The cognitive bias where individuals with limited knowledge or competence
overestimate their abilities, while experts may underestimate their expertise.
NP vs P
If the solution to a problem is easy to check for correctness, must the problem also be
easy to solve? (For example, checking a guessed password is fast, but cracking it may not be.)
The general class of questions for which some algorithm can provide an answer in
polynomial time is "P". For some questions, there is no known way to find an answer quickly, but if one is
provided with information showing what the answer is, it is possible to verify the answer quickly. The class
of questions for which an answer can be verified in polynomial time is "NP", which stands for
"nondeterministic polynomial time".
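The gap between checking and finding an answer can be sketched with the subset-sum problem, a problem in NP for which no polynomial-time algorithm is known. The function names `verify` and `solve` and the sample numbers are illustrative choices, not part of the original text.

```python
from itertools import combinations


def verify(numbers, target, candidate):
    """Check a proposed answer: cheap, roughly one pass over the candidate."""
    remaining = list(numbers)
    for x in candidate:
        if x not in remaining:
            return False  # candidate uses a number that is not available
        remaining.remove(x)
    return sum(candidate) == target


def solve(numbers, target):
    """Find an answer by brute force: may try up to 2**n subsets."""
    for size in range(1, len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == target:
                return list(subset)
    return None


numbers = [3, 34, 4, 12, 5, 2]
answer = solve(numbers, 9)
print(answer, verify(numbers, 9, answer))
```

Verification stays fast as the list grows, while this brute-force search doubles its worst-case work with every extra number. Whether a fundamentally faster search must exist is the open question.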
Cognitive Biases
Systematic errors in thinking that affect human behavior and perception of reality.
Groupthink: The tendency for individuals within a group to conform to a single decision,
often resulting in irrational outcomes.
Logical Fallacies
Flawed reasoning or invalid deductive arguments that hinder effective
decision-making.
Strawman: Misrepresenting an opponent's argument to make it easier to attack.
Sunk Cost Fallacy: The tendency to continue investing in a decision based on the amount
of resources already committed, rather than evaluating the current and future value.
Data Science Glossary
Machine Learning Models
Linear regression: A statistical model that tries to find the relationship between a
dependent variable and one or more independent variables by fitting a linear equation to the observed data.
It is used for predicting continuous-valued outputs.
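For a single independent variable, fitting the line can be sketched with the ordinary least squares formulas. This is a minimal pure-Python illustration; `fit_line` and the sample data are hypothetical, and real projects would normally rely on a statistics library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```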
Logistic regression: A classification algorithm used to model the probability of a certain
class or event existing, such as pass/fail, win/lose, or healthy/sick. It works by fitting a logistic curve
to the given data, which can then be used to predict the probability of binary outcomes.
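The logistic curve itself is short enough to show directly. The sketch below only covers prediction with already-fitted coefficients; the weight and bias values are hypothetical, and the fitting step is left out.

```python
import math


def predict_probability(features, weights, bias):
    """Map a weighted sum of features to a probability with the logistic function."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    # Two algebraically equal forms, chosen to avoid overflow in exp().
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


# Hypothetical pass/fail model: one feature (hours studied).
weights, bias = [1.5], -6.0
for hours in (2.0, 4.0, 6.0):
    p = predict_probability([hours], weights, bias)
    print(f"{hours} hours -> P(pass) = {p:.2f}")
```

With these made-up coefficients the decision boundary sits at 4 hours, where the predicted probability is exactly 0.5.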
Decision trees: A non-parametric supervised learning method used for classification and
regression. The model learns to make decisions by recursively splitting the input data into subsets based on
the values of input features, constructing a tree-like structure.
Random forest: An ensemble learning method that constructs multiple decision trees during
training and combines their predictions to produce a more accurate and robust output. It helps to overcome
overfitting and improve generalization.
Gradient boosting: A machine learning technique that builds an ensemble of weak prediction
models, typically decision trees, one at a time. Each new model is fit to the errors of the current ensemble
(the negative gradient of the loss function), so the ensemble improves step by step. It is used for
regression and classification problems.
Neural networks (CNN): A Convolutional Neural Network (CNN) is a type of artificial neural
network designed to process grid-like data, such as images. CNNs stack multiple layers that learn to
recognize patterns and features in the input through an operation called convolution, which makes them
particularly effective for image recognition tasks.
Clustering (K-Means): An unsupervised learning algorithm that partitions a dataset into K
distinct clusters based on similarity. The algorithm works by iteratively assigning data points to the
nearest cluster center and updating the cluster centers based on the average of the data points within each
cluster.
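The assign-then-update loop can be sketched in one dimension with plain Python. The function names, the data, and the starting centers are hypothetical; practical implementations also handle multi-dimensional data and smarter initialization.

```python
def assign(points, centers):
    """Group points by the index of their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
        clusters[nearest].append(p)
    return clusters


def k_means_1d(points, initial_centers, max_iterations=100):
    """Alternate assignment and mean-update steps until the centers stop moving."""
    centers = list(initial_centers)
    for _ in range(max_iterations):
        clusters = assign(points, centers)
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers = k_means_1d(points, initial_centers=[1.0, 12.0])
print(centers, assign(points, centers))
```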
Dimensionality reduction (PCA, SVD): Techniques used to reduce the number of features in
high-dimensional datasets, making it easier to analyze and visualize the data. Principal Component Analysis
(PCA) and Singular Value Decomposition (SVD) are two popular methods that work by transforming the original
data into a lower-dimensional space while preserving as much of the information as possible.
Time Series: A set of techniques and models used for analyzing and forecasting data that is
indexed in time order, such as stock prices or weather data. Time series models consider patterns, trends,
and seasonal effects in the data to make predictions about future values.
Support Vector Machines (SVM): A supervised learning algorithm used for classification and
regression tasks. For classification, an SVM finds the hyperplane that separates the classes with the
largest margin; for regression, it fits a function that keeps prediction errors within a chosen tolerance.
Naive Bayes: A family of probabilistic classifiers based on Bayes' theorem, which assumes
that the features are conditionally independent given the class. Despite its simplicity, Naive Bayes is
often effective for text classification tasks, such as spam filtering and sentiment analysis.
k-Nearest Neighbors (k-NN): A non-parametric, lazy learning algorithm used for
classification and regression tasks. It works by finding the k training samples closest to a new input and
predicting the class or value based on a majority vote or weighted average of these neighbors.
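A minimal sketch of the classification case, using Euclidean distance and a simple majority vote. The training points and labels are made up for illustration, and `knn_classify` is a hypothetical helper rather than a library function.

```python
import math
from collections import Counter


def knn_classify(training, query, k=3):
    """Predict the label of `query` from the k nearest (features, label) pairs."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    # Majority vote among the k nearest neighbors.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-D training data with two classes.
training = [
    ((1, 1), "red"), ((1, 2), "red"), ((2, 1), "red"),
    ((6, 6), "blue"), ((7, 6), "blue"), ((6, 7), "blue"),
]
print(knn_classify(training, (2, 2)))
print(knn_classify(training, (6, 5)))
```

"Lazy" here means there is no training step: all the work happens at prediction time, when distances to the stored samples are computed.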
Deep Learning (RNN, LSTM, Transformer): A subset of machine learning that uses neural
networks with many layers. Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) networks, and
Transformers are popular deep learning architectures for sequential and time-series data, natural language
processing, and other complex tasks.
Reinforcement Learning (Q-Learning, DDPG): A type of machine learning that focuses on
training agents to make decisions in an environment to maximize cumulative rewards. Q-Learning and Deep
Deterministic Policy Gradient (DDPG) are popular reinforcement learning algorithms.
Autoencoders: A type of unsupervised neural network used for dimensionality reduction,
feature learning, and anomaly detection. Autoencoders learn to compress input data into a lower-dimensional
representation and then reconstruct the original input from this representation.
Generative Adversarial Networks (GANs): A class of deep learning models that consist of two
neural networks, a generator, and a discriminator, which are trained simultaneously in a zero-sum game. GANs
can generate new data samples that resemble the original data distribution, making them useful for tasks
such as image synthesis and data augmentation.
Data Fallacies
Cherry picking: The act of selectively presenting data points or findings that support a
specific conclusion, while ignoring or downplaying evidence that contradicts it. This practice can lead to
incorrect conclusions and misleading interpretations of data.
Survivorship bias: The error of focusing on the data points or elements that have
"survived" a selection process, while overlooking those that did not. This can lead to overly optimistic
conclusions and predictions, because the cases that dropped out are missing from the data.
False causality: The assumption that a causal relationship exists between two variables
simply because they are correlated. The maxim "correlation does not imply causation" warns against this
fallacy: two variables can move together because of chance, a shared cause, or reversed causation.
Sampling bias: The presence of systematic errors in the data that result from a non-random
or unrepresentative sampling process. This can lead to incorrect inferences about the population, as the
sample may not accurately reflect the characteristics of the larger group.
Observer effect: The phenomenon where the act of observing or measuring a system can change
its behavior or properties. In data analysis, this can lead to skewed results if the presence of the
observer or data collector influences the data being collected.
Gambler's fallacy: The mistaken belief that past events can influence the probability of
future independent events. For example, assuming that a coin toss is more likely to land on heads after a
series of tails, despite the fact that each toss is independent and has an equal probability of landing on
either side.
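The independence claim can be checked exactly by enumerating every equally likely sequence of fair coin tosses, instead of relying on a random simulation. The function name below is a hypothetical choice for this sketch.

```python
from fractions import Fraction
from itertools import product


def prob_heads_after_tails(run_length):
    """P(next toss is heads | previous `run_length` tosses were all tails), fair coin."""
    # Every sequence of run_length + 1 fair tosses is equally likely.
    sequences = product("HT", repeat=run_length + 1)
    after_run = [s for s in sequences if all(t == "T" for t in s[:run_length])]
    heads_next = [s for s in after_run if s[-1] == "H"]
    return Fraction(len(heads_next), len(after_run))


for run in (1, 3, 5):
    print(f"After {run} tails in a row: P(heads) = {prob_heads_after_tails(run)}")
```

However long the run of tails, the conditional probability stays at 1/2.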
Regression towards the mean: The phenomenon where extreme values in a dataset tend to be
followed by values closer to the mean. This can lead to incorrect conclusions, such as attributing a change
in performance solely to a specific intervention when it may be due to natural variability.
McNamara fallacy: The error of relying solely on quantitative data and metrics to make
decisions or evaluate performance, while ignoring qualitative factors that may be difficult to measure. This
can lead to suboptimal decision-making and an overemphasis on easily quantifiable variables.
Simpson's paradox: A statistical phenomenon where a trend that appears within each of
several groups reverses or disappears when the groups are combined. This paradox
highlights the importance of considering the underlying structure of the data when analyzing relationships
between variables.
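A small numeric sketch makes the reversal concrete. The counts are illustrative and are not from the text above: treatment A has the higher success rate in each group, yet B looks better once the groups are pooled, because A was used mostly on the harder "large" cases.

```python
# (successes, total) per treatment and group; illustrative counts only.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def group_rate(treatment, group):
    """Success rate of one treatment within one group."""
    successes, total = data[treatment][group]
    return successes / total


def overall_rate(treatment):
    """Success rate of one treatment with all groups pooled together."""
    successes = sum(s for s, _ in data[treatment].values())
    total = sum(t for _, t in data[treatment].values())
    return successes / total


for group in ("small", "large"):
    print(group, f"A={group_rate('A', group):.2f}", f"B={group_rate('B', group):.2f}")
print("overall", f"A={overall_rate('A'):.2f}", f"B={overall_rate('B'):.2f}")
```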
Underfitting vs. Overfitting (Bias-Variance tradeoff): A common issue in machine learning
where a model is either too simplistic (underfitting) or too complex (overfitting). Underfitting occurs when
a model has high bias and does not capture the underlying patterns in the data, leading to poor performance.
Overfitting occurs when a model has high variance and captures noise in the data, leading to poor
generalization to new data. Balancing the tradeoff between bias and variance is key to building effective
models.
Ecological fallacy: The error of making inferences about individuals based on aggregated
data from a group. This fallacy arises when relationships observed at the group level do not necessarily
hold true for individuals within that group.
Confounding variable: A variable that is correlated with both the independent and dependent
variables in a study, causing a distortion in the observed relationship between them. Failing to account for
confounding variables can lead to false conclusions about causality.
Selection bias: The presence of systematic errors in a study due to the non-random
selection of participants, variables, or data points. This can lead to incorrect inferences and
generalizations, as the selected sample may not accurately represent the population of interest.
Confirmation bias: The tendency to search for, interpret, and recall information in a way
that confirms one's preexisting beliefs or hypotheses. This can lead to biased data analysis and
conclusions, as researchers may unconsciously favor evidence that supports their views and disregard
contradictory evidence.
Availability heuristic: The cognitive bias that leads people to overestimate the importance
or likelihood of events based on their availability in memory. In data analysis, this may cause certain
events or trends to be given more weight than they deserve, simply because they are more easily recalled.
Anchoring effect: The cognitive bias that occurs when an initial piece of information is
used as a reference point for subsequent judgments and decisions. In data analysis, this can lead to biased
estimates and conclusions, as the initial value can unduly influence subsequent evaluations.
Base rate fallacy: The error of ignoring or underestimating the base rate (prior
probability) of an event when evaluating the likelihood of that event occurring. This can lead to
misinterpretations of probability and overconfidence in the predictive power of specific indicators or
pieces of evidence.
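Bayes' theorem shows how much the base rate matters. The sketch below uses made-up numbers for a hypothetical screening test: a 1% base rate, 99% sensitivity, and a 5% false positive rate.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical test: condition affects 1% of people.
p = posterior(prior=0.01, sensitivity=0.99, false_positive_rate=0.05)
print(f"P(condition | positive test) = {p:.1%}")
```

Even with a seemingly accurate test, only about one in six positive results is a true positive here, because the 99% of people without the condition produce far more false positives than the 1% produce true ones.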