10 great R packages for stock market data (2024)

The R programming language is widely used for statistical computing, which makes it a strong candidate for financial analysis. Here at lemon.markets, we’re big fans of R, as it allows us (and more importantly, our users 🍋) to perform the data exploration and analysis needed to inform (automated) trade decisions. lemon.markets is a Berlin-based 🇩🇪 brokerage API built by developers, for developers, so that they can build their own brokerage experience at the stock market. In this article, we’ve compiled a list of 10 great R packages that can be used to work with stock market data. After reading, you’ll be able to manipulate your data and begin performing technical analysis on it.

The focus of this article is on stock market data, which means that we’re primarily concerned with importing, manipulating, visualising and reporting on data. And this is where R Shiny-s 😉 the brightest, thanks to its focus on statistics! We also don’t want to overwhelm you with the large number of R packages available; we’re sure this article can have a part two (and three, and four, and…).

lemon.markets offers two APIs: the trading API and the market data API. We’ll focus on the latter in this article. After retrieving, for example, OHLC data on a few financial instruments from our API, there are several directions you can take. Perhaps you want to use these prices to forecast future price movements, or maybe you want to use them to generate (real-time) technical indicators. Regardless of which path you choose, you’ll need to collect, (pre-)process and maybe visualise your stock market data. We’ve collected packages that cover these three steps.

Should I be using R? 🏴‍☠️

R is one of the most widely used languages in the data analytics sector. It’s primarily used in academia, but large companies such as Uber, Facebook and Airbnb also use R for data visualisation and statistical inference. One of the most powerful features of R is that it is open-source and anyone can contribute their own R packages, which means that there are almost endless options to choose from. We’ve actually had to narrow down the list of packages we’re going to share with you today.

When using R for finance, you’ll probably use some more general packages that deal with data management and other finance-specific packages. In fact, many packages actually build upon each other — you’ll come to learn that the R ecosystem is highly interconnected. ♻️

A quick note on the tidyverse 🧼

We can’t write an article about data manipulation without mentioning the tidyverse. It’s a collection of R packages designed specifically for data science, somewhat akin to the SciPy stack for Python. The tidyverse can, as the name suggests, tidy up your data, but it also provides additional functionality such as data visualisation and manipulation. We’ll get into it later, but if the name pops up, you’ll know what we’re talking about.

Collecting stock market data

Before we can do anything with our market data, we need to make sure that we actually have some. This means that we need to collect it. The lemon.markets Market Data API can be used to retrieve historical market data in H1/D1/M1 format, the latest quotes and the latest trades for specific instruments. For example, if you want to request hourly OHLC data for Apple, a request to our API can look as follows:

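install.packages("httr")
library(httr)

# Hourly (h1) OHLC data for Apple, identified by its ISIN
market_url <- "https://data.lemon.markets/v1/ohlc/h1?isin=US0378331005"

# Replace the placeholder with your own API key
api_key <- "YOUR-API-KEY"

response <- httr::GET(
  url = market_url,
  add_headers(Authorization = paste("Bearer", api_key))
)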
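Once the request comes back, the body still has to be turned into something R can work with. The sketch below shows one way to do that with the jsonlite package. The response schema isn’t shown in this article, so the JSON string and its field names (results, t, o, h, l, c) are made-up stand-ins; check the API documentation for the real ones. With a live response you would pass httr::content(response, as = "text") to fromJSON() instead of the hard-coded string.

```r
library(jsonlite)

# Hypothetical payload: the field names are assumptions, not the documented schema
body <- '{
  "results": [
    {"t": "2021-10-29T09:00:00Z", "o": 128.1, "h": 128.9, "l": 127.8, "c": 128.4},
    {"t": "2021-10-29T10:00:00Z", "o": 128.4, "h": 129.2, "l": 128.3, "c": 129.0}
  ]
}'

# fromJSON() simplifies an array of objects into a data.frame
parsed <- fromJSON(body)
ohlc <- parsed$results

# Convert the ISO 8601 timestamps into proper date-times
ohlc$t <- as.POSIXct(ohlc$t, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

print(ohlc)
```

From here the data frame can be handed to any of the packages discussed below.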

For simplicity’s sake, you can also use the R package set up by Mario at Quantargo, which you can find here! He’s one of our prized community members building things with and for the lemon.markets API to make it accessible to all kinds of developers #opensource. Quantargo is a platform that can help you build up data science skills, through courses, workshops (also for businesses) and a browser-based workspace where you can immediately deploy your projects. They also have a wonderful Introduction to R course, check it out.

📚Readr

When you are working with large datasets that require you to structure them, readr is the right choice for you. Like base R, it has commands for reading in rectangular data, whether that be a .csv, .tsv or .fwf file. Parsing a file with readr allows you to specify the data type per column (or it will guess it for you). In addition, readr outputs a tibble, the workhorse of the tidyverse, which is a type of data.frame that allows more complexity in your data (compared to the native format in R). A simple implementation is as follows:

install.packages("readr")
library(readr)

# Declare the type of each column up front instead of letting readr guess
data <- read_csv(
  "filename.csv",
  col_types = list(
    var1 = col_double(),
    var2 = col_integer(),
    var3 = col_datetime()
  )
)

Its tidyverse siblings readxl and googlesheets4 let you read directly from an Excel spreadsheet or Google Sheets. Check out this cheatsheet to learn more.
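To see the column specification in action without needing a file of your own, here is a minimal, self-contained sketch. The file contents and column names (time, close, volume) are invented for illustration; it uses cols(), readr’s own helper for building a column specification.

```r
library(readr)

# Write a tiny, made-up price file so the example is self-contained
path <- tempfile(fileext = ".csv")
writeLines(c(
  "time,close,volume",
  "2021-10-29T09:00:00Z,128.4,1200",
  "2021-10-29T10:00:00Z,129.0,950"
), path)

prices <- read_csv(
  path,
  col_types = cols(
    time = col_datetime(),
    close = col_double(),
    volume = col_integer()
  )
)

# The result is a tibble with the column types we asked for
print(prices)
```

Declaring the types explicitly means a malformed value shows up as a parsing problem instead of silently turning a numeric column into text.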

Quantmod

The quantmod package can load data, chart data and obtain relevant technical signals. This package works with several sources, including (but not limited to) Yahoo Finance and FRED, and it can also fetch data from something like a MySQL database. In the code snippet below, we show you how to load historical price data for AAPL from Yahoo Finance (do note that these prices are in USD):

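install.packages("quantmod")
library(quantmod)

# Download AAPL price history from Yahoo Finance into an object named AAPL
getSymbols("AAPL", src = "yahoo")

# Chart the last 6 months on a white theme, then add the MACD indicator
chartSeries(AAPL, subset = "last 6 months", theme = chartTheme("white"))
addMACD()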

We chose to chart the last 6 months of the OHLCV (that is, Open High Low Close Volume) data, a period that can conveniently be specified in plain words (see the documentation for other formats). We also added the Moving Average Convergence Divergence (MACD) indicator, which shows the relationship between two moving averages of AAPL’s price. This produces the chart below:

[Chart: AAPL OHLCV data for the last 6 months with the MACD indicator]

As you can see, quantmod can be used for (pre-)processing your data too. And there’s a lot more that can be done with it: this is just a taste. Try it out for yourself!

(Pre-)processing your stock market data

Obtaining raw data often means that you’ll need to perform one or more alterations on it: perhaps you have irrelevant data, missing values or data in the wrong format, or you might want to obtain some metrics from this data. Welcome to the (pre-)processing stage, where there are more than enough R packages to help you address the above issues.

⏰ Xts

install.packages("xts")
library(xts)

The xts package is the package for handling time-series data (and it extends the popular zoo package, which means even more methods available to you). As financial data often takes the form of time series data, we expect xts objects to come in handy: think of them as time-indexed matrices. You can perform lots of different operations on these matrices, such as extracting time specific segments of data. For example, if you want to forecast prices, but you don’t want to include the volatile market open and close, you might choose to omit these two time intervals from your model. This guide gives a good overview of what can be done with the two packages.
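As a rough illustration of the time-based extraction described above, the sketch below builds two days of invented hourly prices and then cuts them down. The 09:00 to 17:00 session is an assumption made for the example, not a statement about any particular exchange.

```r
library(xts)

# Two days of made-up hourly prices, indexed in UTC
idx <- seq(as.POSIXct("2021-10-28 00:00:00", tz = "UTC"),
           by = "hour", length.out = 48)
prices <- xts(cbind(close = 100 + seq_along(idx)),
              order.by = idx, tzone = "UTC")

# ISO 8601 style subsetting: everything on 29 October
one_day <- prices["2021-10-29"]

# Hour of day for each observation, in the index's time zone
hours <- as.integer(format(index(prices), "%H"))

# Treat 09:00-17:00 as the session and drop its first and last hour
core_hours <- prices[hours > 9 & hours < 17]

nrow(one_day)
nrow(core_hours)
```

The date-string subsetting is the part that is specific to xts; the hour filter uses base R formatting so that it behaves the same regardless of the xts version installed.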

🌪️ Dplyr

install.packages("dplyr")
library(dplyr)

The dplyr package can be used for data manipulation: it can filter, sort, summarise, select and mutate your data. In financial analysis, this could be useful when you’re:

  • finding financial instruments that are related to each other,
  • obtaining certain metrics, e.g. standard deviation, mean, range, etc.,
  • aggregating price information from different stock exchanges.

This cheatsheet will tell you everything you need to know about using dplyr.
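The last two use cases in the list can be sketched in a few lines. The symbols (AAA, BBB), exchange labels (X1, X2) and prices below are all invented; the point is only the group_by() and summarise() pattern.

```r
library(dplyr)

# Made-up closing prices for two hypothetical symbols on two exchanges
prices <- data.frame(
  symbol   = rep(c("AAA", "BBB"), each = 4),
  exchange = rep(c("X1", "X2"), times = 4),
  close    = c(10, 12, 11, 13, 50, 48, 52, 50)
)

# One row of summary statistics per symbol
stats <- prices %>%
  group_by(symbol) %>%
  summarise(
    mean_close  = mean(close),
    sd_close    = sd(close),
    range_close = max(close) - min(close)
  )

# Average price per symbol and exchange, highest first
by_exchange <- prices %>%
  group_by(symbol, exchange) %>%
  summarise(mean_close = mean(close), .groups = "drop") %>%
  arrange(desc(mean_close))

print(stats)
print(by_exchange)
```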

📅 Lubridate

Lubridate is yet another component of the tidyverse. This package’s role? Ensuring that your date-time objects are correctly formatted and/or combined. For example, the following code snippet,

install.packages("lubridate")
library(lubridate)

# Convert a Unix timestamp (seconds since 1970-01-01 UTC) to a date-time
date <- as_datetime(1635592026)

will return 2021-10-30 11:07:06 UTC. It’s robust against timezones, leap years and any other time anomalies you can think of. This might be useful if you’re working with more than one data source (with different formats) or if your trading platform only accepts certain formats.
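To make the “more than one data source” case concrete, here is a small sketch in which the same instant arrives in three different formats. The formats are picked for illustration; your sources may well use others.

```r
library(lubridate)

# The same instant, arriving in three different formats
from_unix <- as_datetime(1635592026)
from_iso  <- ymd_hms("2021-10-30T11:07:06Z")
from_dmy  <- dmy_hms("30/10/2021 11:07:06", tz = "UTC")

# All three parse to the same point in time
from_unix == from_iso
from_iso == from_dmy

# Show that instant on a Berlin wall clock
berlin <- with_tz(from_unix, "Europe/Berlin")
format(berlin, "%Y-%m-%d %H:%M:%S")
```

with_tz() changes only how the instant is displayed, not the instant itself, so the Berlin value still compares equal to the UTC one.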

🚦TTR

# Package names are case-sensitive: it is "TTR", not "ttr"
install.packages("TTR")
library(TTR)

Technical Trading Rules (TTR) is a popular choice when it comes to technical trading signals. It includes over 50 technical indicators such as the more obscure Chande Momentum Oscillator (CMO) or the well-known Relative Strength Index (RSI). If you’d like to learn more about how trading signals can be used to motivate your strategies, you can read our article on beginner-friendly trading strategies.

TTR can also be used to obtain several volatility measures, such as True Range (TR) or the Chaikin Volatility (VT). You can use them to determine how much risk you are exposing yourself to and whether this aligns with your trading philosophy.
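As a minimal sketch of what these indicators look like in code, the example below computes an RSI, a simple moving average and an Average True Range. The price series is a simulated random walk, and the highs and lows are simply the close plus or minus 1, so the numbers carry no market meaning.

```r
library(TTR)

# Made-up closing prices: a random walk around 100
set.seed(42)
close <- 100 + cumsum(rnorm(60))

# 14-period Relative Strength Index, bounded between 0 and 100
rsi <- RSI(close, n = 14)

# 10-period simple moving average
sma <- SMA(close, n = 10)

# Average True Range needs high, low and close columns (in that order);
# the highs and lows here are simply the close plus or minus 1
hlc <- cbind(high = close + 1, low = close - 1, close = close)
atr <- ATR(hlc, n = 14)

tail(cbind(close, rsi, sma, atr = atr[, "atr"]), 3)
```

Each indicator returns NA for the warm-up period at the start of the series, so remember to drop or handle those rows before feeding the values into a strategy.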

🧹 Tidyquant

Tidyquant is the bridge between the tidyverse and zoo, xts, quantmod and TTR. It basically makes working with the aforementioned packages easier by formatting the data in a tibble. For example, the kind of data loading we did in the ‘quantmod’ section can be reformulated as follows (here for GOOG rather than AAPL):

install.packages("tidyquant")
library(tidyquant)

# Download GOOG price history as a tibble
google <- tq_get(x = "GOOG")

This ensures a tibble as output, meaning we can use many of the featured data manipulation tools on the OHLCV data without having any formatting issues! See this page for the core functionalities of the package.

Visualising stock market data

Visualising your data can also be an important component in determining trade decisions. You might be able to spot patterns and anomalies that aren’t immediately apparent by looking at the raw price data. Check out this article to get an idea.

📈 Ggplot2

Ggplot2 is another member of the tidyverse. It can be used to create graphs from your data and gain insight into your dataset. For example, we can plot the stock prices of two financial instruments on the same graph to (visually) determine whether there is co-movement (do note that this should be confirmed with a statistical test, e.g. the Engle-Granger test):

install.packages(c("tidyquant", "ggplot2", "dplyr"))
library(tidyquant)
library(ggplot2)
library(dplyr)

# Daily prices for both symbols, returned as one long tibble
multiple_stocks <- tq_get(c("GOOG", "AMZN"),
                          get = "stock.prices",
                          from = "2021-01-01",
                          to = "2021-10-31")

# Use %in% (vectorised) rather than ||, which only compares single values
ggplot(data = filter(multiple_stocks, symbol %in% c("GOOG", "AMZN")),
       aes(x = date, y = open, color = symbol)) +
  geom_line()

This code snippet will output a graph that looks as follows:

[Chart: opening prices of GOOG and AMZN from January to October 2021]

From visual inspection alone, it appears that the Google time series includes a drift (time trend), whereas Amazon appears to somewhat oscillate around a mean. These insights suggest that these two stocks are probably not good candidates for a pairs trading strategy.
Check out this tutorial for more inspiration on how ggplot2 can be used for financial data.
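When two instruments trade at very different price levels, co-movement is easier to judge if both series are first rebased to a common starting value. The sketch below shows one way to do that before plotting. It runs on simulated prices for two hypothetical symbols (AAA and BBB) so that it works offline; with real data you would start from the tq_get() output instead.

```r
library(ggplot2)

# Made-up opening prices for two hypothetical symbols
set.seed(1)
dates <- seq(as.Date("2021-01-04"), by = "day", length.out = 100)
prices <- data.frame(
  date   = rep(dates, times = 2),
  symbol = rep(c("AAA", "BBB"), each = 100),
  open   = c(1700 + cumsum(rnorm(100, mean = 5, sd = 20)),
             3200 + cumsum(rnorm(100, mean = 0, sd = 40)))
)

# Rebase each series so that its first observation equals 100
first_open <- ave(prices$open, prices$symbol, FUN = function(x) x[1])
prices$indexed <- 100 * prices$open / first_open

p <- ggplot(prices, aes(x = date, y = indexed, color = symbol)) +
  geom_line() +
  labs(y = "Opening price (first day = 100)")

p
```

Rebasing only changes the scale, not the shape, of each series, so any trend or mean reversion you spot is still a property of the original prices.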

Python & R

In the realm of data science, you’ll rarely work with just one programming language and/or platform. For example, if you’re working with a multidisciplinary team, you might need to jump from one language to another, or find a way to integrate them into a single script. At lemon.markets, we’re partial to using both Python and R to design our trading strategies, so we thought it might be useful to find a way to combine the two.

➡️ Reticulate

The reticulate package allows you to embed a Python session within an R script, which makes the transition between the two more seamless. This could be useful if you are, for example, using R for data exploration and Python to automate your trading strategy.
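A minimal sketch of what that hand-off can look like is shown below. It assumes a working Python installation that reticulate can find; the pct_change function and the prices are invented for the example.

```r
library(reticulate)

# Made-up closing prices held in an R vector
closes <- c(101.2, 102.5, 99.8)

# Define a function inside the embedded Python session
py_run_string("
def pct_change(xs):
    return [(b - a) / a for a, b in zip(xs, xs[1:])]
")

# Call the Python function from R; arguments and results are converted
changes <- py$pct_change(closes)

changes
```

The R vector is converted to a Python list on the way in and the returned list is converted back to an R vector, so neither side needs to know about the other’s data types.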

All in all, there are plenty of R packages that can be useful in the finance context. We’ve discussed the benefits of using the tidyverse (tibbles!), how certain financial packages can be used in conjunction with the tidyverse and how to obtain technical signals. But we’ve only scratched the surface!
Are there any other R packages that you think are unmissable when it comes to finance and automated trading? Share them below! And if you’re not yet part of lemon.markets, join our waitlist. We’d love to see your R projects.

Marius from lemon.markets 🍋

FAQs

What is the R package for stock data? ›

Quantmod is an R package specifically designed for quantitative financial modeling and trading. It provides a wide range of functions and tools for collecting, analyzing, and visualizing financial and stock market data.

Is R good for stock analysis? ›

R is a powerful tool for data analysis in finance due to its flexibility, scalability, and ease of use. By following best practices such as cleaning and preparing data, using appropriate statistical techniques, visualizing data, and documenting code, R can be used to analyze financial data accurately.

What are the commonly used R packages? ›

The dplyr, stringr, and readr packages are necessary for data manipulation and wrangling. ggplot2 and leaflet are powerful tools for creating static, animated, and interactive graphics. caret is a popular choice for machine learning in R for beginners.

What are the R packages for financial analysis? ›

R is a language widely used in statistical computing and graphics. It is open-source, and it offers a wide variety of packages for financial analysis. Among these packages are PerformanceAnalytics, Quantmod, and Tidyquant, which are useful for data importing, data visualization, and performance measurement.

What is the best R package for time series forecasting? ›

R has at least eight different implementations of data structures for representing time series. We haven't tried them all, but we can say that zoo and xts are excellent packages for working with time series data and better than the others that we have tried.

How to create a vector of stocks in R? ›

To create a vector, we will use the 'c' function, which stands for combine. Here, we are combining three days’ worth of stock prices into one vector. Notice how there are commas separating each element in the vector. Now, typing in 'apple_stock' returns a vector of length three.
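The answer above describes code that isn’t shown here, so this is a minimal reconstruction of what it likely looked like. The three prices are made up.

```r
# Three days of made-up closing prices combined into one vector
apple_stock <- c(109.49, 109.90, 109.11)

# Typing the name prints the vector; length() confirms it has three elements
apple_stock
length(apple_stock)
```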

Which analysis is best for stock market? ›

Fundamental analysis is most often used when determining the quality of long-term investments in a wide array of securities and markets, while technical analysis is used more in the review of short-term investment decisions such as the active trading of stocks.

Can R be used for trading? ›

R is a language that is specifically designed for statistical analysis and data visualization. It is often used in combination with other languages, such as Python or C++, to develop algorithmic trading systems that require complex statistical models.

Which chart is best for stock analysis? ›

Line charts provide a simplified view of an asset's price movement by connecting closing prices with a line. To enhance your analysis, think about using a line chart when you want to see something over time as it's a great tool for trend analysis over a period.

What is the most downloaded R package? ›

CRAN R Packages by Number of Downloads

Rank  Package Name  Downloads
1     ggplot2       141,286,240
2     rlang         129,436,034
3     magrittr      121,835,743
4     dplyr         105,837,791
(160 more rows not shown)

What are the 6 basic data types in R? ›

Basic Data Types
  • numeric - (10.5, 55, 787)
  • integer - (1L, 55L, 100L, where the letter "L" declares this as an integer)
  • complex - (9 + 3i, where "i" is the imaginary part)
  • character (a.k.a. string) - ("k", "R is exciting", "FALSE", "11.5")
  • logical (a.k.a. boolean) - (TRUE or FALSE)
  • raw - (bytes, e.g. as.raw(255) or charToRaw("A"))

What is the name of the biggest R package repository? ›

CRAN, The Comprehensive R Archive Network, is the primary package repository in the R community. CRAN is a network of ftp and web servers around the world that store identical, up-to-date, versions of code and documentation for R.

Do finance professionals use R? ›

Finance professionals are increasingly turning to R programming because it's ideal for data science, analysis, and visualization tasks.

Is R or Python more useful for finance? ›

R is mostly used by data scientists, as it is focused on data analysis. Python has overtaken it in general popularity, but because finance revolves around the calculation and analysis of data, R is a strong choice.

How can R be used in finance? ›

R is also a common symbol representing "return" in many financial formulas. There are many different types of returns and they are usually denoted with the upper or lower case letter "R," though there is no formal designation. If there are multiple returns used in a calculation, they are often given subscript letters.

What are the packages in the R library? ›

Packages are collections of R functions, data, and compiled code in a well-defined format. The directory where packages are stored is called the library. R comes with a standard set of packages. Others are available for download and installation.

What does R Squared measure in stocks? ›

R-squared measures how closely each change in the price of an asset is correlated to a benchmark. Beta measures how large those price changes are in relation to a benchmark. Used together, R-squared and beta give investors a thorough picture of the performance of asset managers.
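Both measures fall out of a single regression of asset returns on benchmark returns, which base R can do without any extra packages. The sketch below uses simulated returns, so the resulting numbers are illustrative only.

```r
# Made-up daily returns for a benchmark and an asset that partly follows it
set.seed(7)
benchmark <- rnorm(250, mean = 0.0005, sd = 0.01)
asset <- 1.2 * benchmark + rnorm(250, mean = 0, sd = 0.005)

# Regress asset returns on benchmark returns
fit <- lm(asset ~ benchmark)

beta <- unname(coef(fit)["benchmark"])  # slope: size of moves vs. benchmark
r_squared <- summary(fit)$r.squared     # share of variance explained

# Equivalent closed forms for a single-factor regression
beta_check <- cov(asset, benchmark) / var(benchmark)
r_squared_check <- cor(asset, benchmark)^2

c(beta = beta, r_squared = r_squared)
```

The closed forms make the relationship explicit: beta is the covariance with the benchmark scaled by the benchmark’s variance, and R-squared is the squared correlation between the two return series.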

How is R used in big data? ›

Some notable reasons why R is used in Big Data are: Data Manipulation: R's packages like dplyr and data.table enable efficient data manipulation, filtering, and transformation, making it suitable for preprocessing large datasets.

Article information

Author: Melvina Ondricka
