Unveiling the Secrets of Investing: Why You're Ready to Dive In Now

As we journey through the maze of financial literacy, one lesson stands out: the art of investing. Picture this – you've mastered the art of saving pennies and living within your means, but what about putting those hard-earned dollars to work for you? That's where investing steps into the spotlight.

Let's rewind a bit. Growing up, I was all about pinching pennies and stashing cash in my piggy bank. But the world of investing? It seemed like a distant galaxy, reserved for the elite with overflowing bank accounts. Little did I know, the key ingredients for investing weren't just heaps of cash – they were money and time.


Now, here's the scoop: while you might not have stacks of cash lying around, you've got something even more valuable – time. Yep, time is your secret weapon in the world of investing. Starting early means your money has more time to grow and flourish, like planting seeds in a garden and watching them sprout into mighty oak trees. Here are some charts showing why starting early is important.

But hold up – there used to be a catch. Remember those pesky commissions? Yeah, they were like the gatekeepers to the investing world, demanding hefty tolls for each trade. Imagine trying to invest $10 but getting slapped with a $5 fee – talk about a buzzkill! But fear not, my friends, because the tides have turned. Commission-free investing is now the norm, thanks to platforms like Robinhood, Fidelity, and E*TRADE among many others.

And that's not all – say hello to fractional shares, the game-changer for budding investors everywhere! Got your eye on a pricey stock or ETF but only have a few bucks to spare? No problemo! With fractional shares, you can own a piece of the pie without breaking the bank. It's like buying a slice of your favorite pizza instead of the whole pie – affordable and oh-so-satisfying.


Now, let's talk about the elephant in the room – market knowledge. Sure, you might not have time to pore over financial reports and dissect market trends. But guess what? That's totally okay! You don't need to be a Wall Street wizard to make savvy investment decisions. Instead, focus on low-cost, diversified options like index funds and ETFs. They're like the Swiss Army knives of investing – versatile, reliable, and beginner-friendly. Take a look at VTI, great place to start.

In a nutshell, there's never been a better time to dip your toes into the world of investing, especially for young bucks. Time is on our side, and with just a few bucks, we can kickstart our investment journey. So go ahead, take the plunge – whether it's $1 or $100, the world of investing is yours for the taking. And the best part? You can set it and forget it, letting your money work its magic while you focus on the fun stuff. Cheers to a bright financial future!



Compound Interest Calculator with Contribution

Albert Einstein is famously quoted for saying that "Compound interest is the 8th wonder of the world. He who understands it, earns it; he who doesn't, pays it". Thankfully none of us need to be Albert Einstein to understand the rather simple and yet powerful concept of Compound Interest. There are a lot of compound interest calculators out there. I particularly like the Compound Interest Calculator from Investor.gov. The main reason I like this calculator is that it allows you to include Monthly Contributions, which if you are a serious investor, should be part of your strategy.  Let's take a look at how it works. First, you need to enter some general information such as starting amount, monthly contribution, length of time and estimated interest rate. For this example, I will use the numbers below:



Once you have entered the information above, all you need to do is click "CALCULATE" and the graph below will appear showing you how your account balance is expected to grow over time. Additionally, you can also get the data in table form. Another reason I like this particular calculator is because it is so simple and yet it provides you with very powerful option beyond monthly contributions such as interest rate variance and compounding frequency.







So, if you are looking for a compound interest calculator, give this one a try!

Why use a Design-of-Experiment (DOE) Matrix

The word Design-of-Experiment, or DOE, has become very popular among engineers. However, often times this word is misused and misunderstood. The principles of DOE are based on the capabilities and limitations of the analysis tools that will be used to process the data and determine cause and effect. Sometimes proper DOE matrix are followed but adequate analysis is not performed. Other times, a proper DOE matrix is not followed but yet we try to use analysis methods that would have required a proper DOE matrix. Even more confusing is when we use the word DOE when neither a DOE matrix or analysis method was followed. 

First, it is important to acknowledge that DOE results are intended to be analyzed using regression analysis and something called "response surface method" or RSM. These methods rely on using a proper DOE matrix to help with variable independence or orthogonality and statistical significance. 

Let's take a look at the following example. Say you have 4 variables in a process (A, B, C and D) and you want to understand their impact in your process outcome. You conduct the following study where -1, 0, and 1 represent each variable min, middle, and max value:


You may be tempted to say that you have conducted a DOE. However, the table above has not followed any proper DOE matrix design. Even with the table above, it will be difficult to understand the effect of each variable (main effects) since some variables are strongly correlated. For example, let's take a look at the relationship between Parameter A and Parameter B:


Note from the graph above that the coefficient of determination, R^2, is quite high, indicating that Parameter A and Parameter B are confounded. When parameter A goes up, Parameter B also often goes up. It will be difficult for regression analysis to decouple the effect of parameter A vs. Parameter B. If you created a table of R^2 for all the variables, it will look like this (correlation table):




In the table above, you expect to have "1" along the diagonal, but you want to minimize the non-diagonal terms (ideally they would be zero). If you are a MATLAB user, you can use the function "corrplot(X)" to graphically obtain the table above or "corrcoef(X)" to get the table in a variable. Please note that MATLAB will plot R and not R^2. You can also use the DOE Diagnostic, Evaluate design option in JMP to do the same. Other programs like R and Python have similar functionality. Some programs plot R instead of R^2 but the essence of the message is the same.

The study matrix above also present troubles if there are significant variable interactions. For example, let's take a look at the correlation between Parameter A*C Interaction vs. and Parameter A*D Interaction:




Note that the interaction between Parameter A*C and Parameter A*D are also strongly correlated. It will be difficult for the statistical tools to tell the difference between these two interaction effects. The complete correlation table (these are R^2 Values) including 2-level interactions and square terms is shown below. By exploring the table below you will see that there is a lot of variable confounding in this study.




Ideally, if you have 4 parameters at 3 Levels, you would want to do a Full Factorial (FF) DOE which will result in 3^4 = 81 runs (#Levels^#Factors = # runs). If you did this, you would have perfect decoupling and the DOE would be perfectly orthogonal.  The correlation matrix (now in absolute value of R) would look something like this with 1 along the diagonal and zero everywhere else. This design is ideal to find all the main effects and interactions as well as squared terms.




However, 81 runs is a lot of runs. You may perhaps be limited to a much smaller number of runs. Here is where you can choose some other DOE matrix designs such as a central composite design (CCD). A CCD design will look like this having a total of 26 runs.


The corresponding correlation plot will look like this. Note that only the square terms (X1^2, X2^2, X3^2, and X4^2) are confounded but not the first level interactions.



You may still be limited to a much smaller number of runs. After all, the original study only had 6 runs. Here is where one must understand the trade-offs and limitations of what you can do with such a small number of runs and large number of factors and levels. You can use something called D-optimal design to find the an optimal design with a constrain in the number of runs given that perhaps you are most interested in the main effects. For example, the table below is from a D-optimal design focusing around the main effects:







In the correlation matrix above one can see that the main effects have been decoupled as much as possible but some interactions are stilled coupled to main effects. This matrix was created to decouple the variables as much as possible given only 6 runs. 



In summary:

1. Don't use the word DOE if you have not followed proper DOE matrix designs and/or used adequate regression modeling to evaluate your results for significant cause and effect

2. Do spend some time comparing DOE matrix design tables to understand the trade-off with number of runs and variable confounding and thus analysis capabilities

3. Be aware of the modeling limitations you may have (linear, second order, main effects, interactions, main effect second order, etc.)

4. Always use your engineering judgement when performing DOE studies. Follow statistical recommendations as long as they make engineering sense for your application

Monte Carlo Analysis: Right Triangle

Here is a very simple problem. Assume you want to build a right triangle of known base (10) and height (3) dimension. However, the sigma tolerance of these dimensions are 1% of its value. Your goal is to find the variability that you will obtain for the dimension of the hypotenuse and the largest of the two complementary angles. It is straight forward to do some simple math to figure out the nominal value for the two variables of interest. This is shown in the figure below:


Ruander Right Triangle Monte Carlo




Note that the variables of interest are not linear with respect to the known variables. Therefore, it is not possible to simply add the variances of the two known dimensions. It is possible to use the principles of propagation of uncertainty to propagate the known variation through the non-linear equations. This exercise will be left to the reader. Here, I will show how you can quickly use my Excel Monte Carlo Workbook to solve this problem. First, download the free version of this workbook or if you wish, you can also purchase the non free version here.  Set up the problem with 2 input variables of normal distribution with the given mean and sigma values. The inputs will look like this: 

Monte Carlo Excel



After setting up the problem, simply  run a large enough number of simulations by entering the number of simulations in cell F6 and then click the "2. Simulate" bottom. I recommend running at least 500K simulations. For this example, I will run 1M simulations. After the simulation is completed, just type in the formula to compute the hypotenuse and the angle as shown in the first figure above. Drag this equation down for all the simulations performed. 



Finally, simply compute the average and standard deviation for the variables of interest. The results are the following:




As mentioned earlier, if you follow the principles of uncertainty propagation using partial derivatives, you will arrive at this same result (Rule No. 5 here). This is shown in the table above under the Analytical column. Using the Monte Carlo Excel you can also visualize the distribution of these two variables.



I hope this example is useful to help identify other relevant applications where a Monte Carlo Analysis can help you understand variation of non-linear problems.

Monte Carlo: Roll 2 Dice

Sometime ago I posted about the Sum of Two Dice. In that post, I explained the math behind it and I also performed a Monte Carlo Simulation in R. I would like to go back to this problem to show how my free Monte Carlo Simulation Excel file can be used in this situation.

Assuming a fair die, each die has six different possible output values (1, 2, 3, 4, 5, and 6) and the probability of any of these values should be equal (1/6). The probability function that describes this behavior is the uniform discrete probability function with a lower bound of 1 and an upper bound of 6. The reason that this is a discrete probability is that in between values that are not integers (such as 1.5) are not an option.

Using my Monte Carlo Simulation Excel file to simulate the sum of two dice is very straight forward. First, click "0. Reset"  in the Simulator sheet to make sure you are starting with a fresh and clean sheet. Indicate that the number of inputs is 2 by typing this value in cell F5. Remember, each input will represent one die. Now, click on "1. Define Inputs" and select the Uniform_Discrete probability function for both inputs with 1 as the lower bound (parameter 1) and 6 as the upper bound (parameter 2). Leave the other parameters empty. Your Simulator sheet should look like this:
Now indicate the number of simulations to perform by typing the number of simulations in cell F6. For this example, I will perform 1 million simulations knowing that this large number of simulations will likely take some time to complete so I will have to be patient. It took ~30 mins in a very slow computer on battery. Note that if I would have done 100K simulations it would have only taken about 3 minutes (time is approximately linear with number of simulations). Faster computers will finish faster.

Once the simulation is completed, there are a couple of things to do. First, in the "Y=" column, you will want to type the equation which is basically the sum of die 1 and die 2. This should look like this:


First, let's explore the output of each die. My Monte Carlo Simulation Excel file automatically outputs a histogram of your X1 variable if you have excel 2016 or higher. However, for discrete outcomes, I like making my own using the Excel Function COUNTIF. Doing this I obtained very similar outputs for both X1 and X2. Below I show the output for X1. Note that the number of occurrences for each number is about 1/6 of the total as expected.


In the same way, we can plot the histogram of our Y= X1 + X2. Note that in accordance with my previous post, the number 7 occurred approximately 16.6667% of the time. It is pretty interesting how the sum of two variables that have a uniform discrete probability function results in an output variable with a discrete triangular distribution function.



Now that we have looked into the sum of 2 dice, how about the sum of 3 dice, or 7 dice? Can you use my Monte Carlo Simulation Excel file to do this with more than 2 dice?