S&P500 Dashboard

Posted on Feb 4, 2018

This is my Shiny dashboard for stocks listed in the S&P 500. It uses R Shiny to give a visual overview of the market and an ARIMA model to analyze risk for a single stock.


Background

When I worked as a trader, I found daily market research inconvenient because the information on mainstream websites could be confusing and overwhelming. It was difficult to find a single website that gave me all the information I preferred to use, so I usually had to check many different websites at the same time to extract what I was interested in. Even after gathering everything together, I might still not get a big picture of the whole market because there was so much noise in the information I had collected. So I made this dashboard, which helps me extract information and does the basic fundamental analysis for me.

Introduction

This is a dashboard for the S&P 500 stock market that gives the user a general view of the market. It is built on the Shiny package from RStudio. The main dataset comes from Kaggle. In addition, I use the Alpha Vantage API to get the price movement of the SPDR sector ETFs. To process the dataset, I use dplyr and tidyr. To visualize the major market trends, I use ggplot2, plotly, word cloud, and googleVis. Then I use an ARIMA model for price risk analysis. If you are interested, the code is on my GitHub and the app itself is hosted on ShinyApps.io.

App Detail

Word Cloud

Based on web scraping and word cloud visualization, this page shows users the hot words discussed on any date for which Nasdaq has a historical news page. Users only need to copy the URL of a Nasdaq news page and paste it into the URL input. The word cloud updates automatically.
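The counting step behind a word cloud can be sketched as follows. This is a minimal illustration in Python, not the app's R code: the stop-word list, the `word_frequencies` helper, and the sample text are all assumptions made up for the example, and the scraping step is left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real app would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "for", "is", "as", "at", "by"}

def word_frequencies(text, top_n=5):
    """Return the top_n most frequent words in text, ignoring stop words and short tokens."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top_n)

# Made-up text standing in for the body of a scraped news page.
sample = "Stocks rally as tech stocks lead the market. Market breadth improves as stocks climb."
print(word_frequencies(sample, top_n=2))
```

The resulting (word, count) pairs are what a word cloud renderer would scale font sizes by.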

Main tab

There are four plots on this page. All of them are based on the sector the user chooses. The details of each plot are described below.

1. Treemap

Users can choose the date they are interested in with the date selector in the sidebar. The size of each node is based on the volume of the ETF or stock, and the color shows the gain or loss of that ETF or stock. Users can click on an ETF node to zoom in and see the stocks listed in the selected sector.
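As a rough sketch of how the treemap inputs could be derived, the Python snippet below maps each row to a node whose size is its volume and whose color value is its percentage gain or loss. The post does not say how gain or loss is measured, so the open-to-close change used here is an assumption, as are the `treemap_nodes` helper, the field names, and the sample rows.

```python
def treemap_nodes(rows):
    """Turn price rows into treemap nodes: size = volume, color = percent change.

    Assumption: gain/loss is the open-to-close change for the selected date.
    """
    nodes = []
    for row in rows:
        change_pct = (row["close"] - row["open"]) / row["open"] * 100.0
        nodes.append({
            "id": row["symbol"],
            "parent": row["sector"],   # the sector ETF node this stock sits under
            "size": row["volume"],
            "color": round(change_pct, 2),
        })
    return nodes

# Made-up rows for illustration only; these are not real quotes.
rows = [
    {"symbol": "AAA", "sector": "Tech", "open": 100.0, "close": 102.0, "volume": 5000},
    {"symbol": "BBB", "sector": "Tech", "open": 50.0, "close": 49.0, "volume": 1200},
]
for node in treemap_nodes(rows):
    print(node)
```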

2. Barplot and Heatmap

Users can choose the indicator and the sector they are interested in and show them on the bar plot. Both the sector and the indicator can be chosen in the sidebar. At the same time, the correlation between each stock and SPY is plotted as a heat map.
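The correlation behind each heat map cell can be illustrated with a small Python sketch. It assumes the correlation is a Pearson correlation of daily returns; the post does not say whether prices or returns are used, and the helper names and price lists are made up for the example.

```python
from math import sqrt

def daily_returns(prices):
    """Simple returns between consecutive closing prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up closing prices; the stock here is exactly twice SPY, so returns match.
spy = [100.0, 101.0, 99.0, 102.0]
stock = [200.0, 202.0, 198.0, 204.0]
print(pearson(daily_returns(stock), daily_returns(spy)))
```

Repeating this for every stock in the chosen sector gives one value per stock, which is what the heat map colors.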

3. Price movement

Users can choose the stocks they are interested in and plot them as candlestick charts on these two plots. Users can also select the two SMAs (simple moving averages) they prefer to use.
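For reference, a simple moving average over a window of n points is the mean of the latest n closing prices. The Python sketch below is illustrative only; the `sma` helper and the sample prices are assumptions, not the app's code.

```python
def sma(values, window):
    """Simple moving average; positions without a full window are None."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        out.append(running / window if i >= window - 1 else None)
    return out

# Made-up closing prices for illustration.
closes = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sma(closes, 3))
```

Plotting a short-window and a long-window SMA over the same candlesticks gives the two overlay lines the user selects.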

Time Series Analysis

This tab provides time series analysis for the stock the user chose. The three parameters of the ARIMA model can be adjusted, which makes the model testing process visual. I include basic tests for the ARIMA model: the Box test and the Shapiro test on the residuals of the model. By default, 200 points are used to fit the model and the remaining 50 points are used to test it. The result of this accuracy test is reported with indicators such as MAPE and MPE, and users can choose which they prefer.
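The holdout evaluation can be sketched in Python as below. It is not the app's R code: a naive "repeat the last training value" forecast stands in for the ARIMA forecast, the series is made up, and the sign convention for MPE (actual minus forecast, divided by actual) is an assumption.

```python
def split_series(series, n_train=200, n_test=50):
    """Split a series into a training part and the test part that follows it."""
    return series[:n_train], series[n_train:n_train + n_test]

def mpe(actual, forecast):
    """Mean percentage error, in percent; signed, so it shows bias."""
    return 100.0 * sum((a - f) / a for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Made-up price series of 250 points, for illustration only.
series = [100.0 + 0.1 * i for i in range(250)]
train, test = split_series(series)

# Stand-in forecast (NOT ARIMA): repeat the last training value.
forecast = [train[-1]] * len(test)
print(len(train), len(test))
print(mape(test, forecast), mpe(test, forecast))
```

MAPE measures the average size of the error, while MPE keeps the sign and so indicates whether the forecasts run systematically high or low.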

Future Improvement

Once I find a good website with an API or a URL that can be parsed using only a date, I will make the word cloud controllable through the date in the sidebar. I will also add natural language analysis of the news. For the time series part, I will show the result of the optimized model found with auto.arima from R's forecast package.



About Author

Zhe Yang

Hi, my name is Zhe Yang. I got my master's degree in Financial Analyst at Rutgers University. I love challenges and solving difficult problems. I used to be a trader in the T3 trading company. During I worked...

Leave a Comment

Zhe Yang February 7, 2018
Thanks! I enjoy using dygraphs to plot time series data, especially using candlestick chart with the time range.
Zhe Yang February 7, 2018
I am glad that you like it!
Petr Shevtsov February 6, 2018
Awesome work! I noticed you used dygraphs for time series. How do you like it?
Stephen E Jones February 6, 2018
Thanks!
