In a previous post I looked at retrieving a list of assets from the Alpaca API using the {alpacar} R package. Now we’ll explore how to retrieve historical and current price data.
First you’ll need to load the package. If you haven’t installed it yet then you can follow the instructions here.
library(alpacar)
Before making any calls to the API you’ll also need to authenticate.
Historical Bars
The most common representation of asset price data is some form of OHLC (open, high, low, close) chart, so this seems like a natural place to start. OHLC data can be retrieved at various resolutions using the bars() function.
SPY <- bars("SPY", "1H", "2024-10-30", "2024-10-31")
That retrieves a single day of OHLC data for SPY at 1 hour resolution.
Both the symbol and timeframe parameters need to be specified. The symbol can be any one of the assets listed in the previous post. The timeframe should have one of the following units:
T: minute
H: hour
D: day
W: week or
M: month.
The remaining arguments are the start and stop times.
💡 You can get data for multiple assets by specifying a vector of symbols.
timestamp open high low close volume trades vwap
1 2024-10-30 08:00:00 582.89 583.23 582.5 582.66 25070 459 582.87650
2 2024-10-30 09:00:00 582.58 583.13 582.5 582.92 11240 248 582.85297
3 2024-10-30 10:00:00 582.95 583.05 582 582.16 112383 1002 582.56281
4 2024-10-30 11:00:00 582.13 582.65 581.77 581.91 179837 1562 582.29877
5 2024-10-30 12:00:00 582.37 583.06 581.39 581.55 307287 3514 581.87937
6 2024-10-30 13:00:00 581.52 581.91 579.29 581.85 4649025 50171 580.79442
7 2024-10-30 14:00:00 581.84 582.96 581.7401 582.3799 3837746 86580 582.43233
8 2024-10-30 15:00:00 582.3799 583.32 581.59 583.105 3152102 38712 582.55026
9 2024-10-30 16:00:00 583.1 583.21 581.99 582 1971462 28405 582.55270
10 2024-10-30 17:00:00 581.99 582.305 580.29 580.64 4096049 40733 581.41390
11 2024-10-30 18:00:00 580.624 581.81 580.624 581.18 3929147 36692 581.35409
12 2024-10-30 19:00:00 581.17 581.17 579.38 579.97 11065819 84075 580.10523
13 2024-10-30 20:00:00 579.94 580.61 578.75 579.22 6919995 10759 579.77867
14 2024-10-30 21:00:00 579.2501 579.37 578.43 578.45 542453 1658 578.96646
15 2024-10-30 22:00:00 578.61 579.08 577.87 578.13 250834 2252 578.21520
16 2024-10-30 23:00:00 578.1199 578.3 577.8 577.9 75770 922 577.99987
I dropped the symbol column for brevity. There are three extra columns that merit an explanation:
volume: the total number of shares traded;
trades: the number of individual transactions; and
vwap: the Volume-Weighted Average Price, which considers the price and volume for each trade during the interval and calculates the average price weighted by the volume. This gives a more accurate indication of the effective price at which the asset traded than a simple average of the trade prices.
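To make the VWAP definition concrete, here is a minimal sketch in base R. The trades are made-up numbers for illustration only; they are not data from the API.

```r
# Hypothetical trades within a single bar (prices and sizes are made up).
trades <- data.frame(
  price = c(582.50, 582.60, 582.90),
  size  = c(100, 300, 50)
)

# VWAP: sum of (price * size) divided by the total size.
vwap <- sum(trades$price * trades$size) / sum(trades$size)

# The unweighted mean treats every trade equally, regardless of its size.
simple_mean <- mean(trades$price)

round(c(vwap = vwap, simple_mean = simple_mean), 4)
```

The large trade at 582.60 pulls the VWAP below the simple mean, which is influenced equally by the small trade at 582.90.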
We can use chartSeries() from {quantmod} to generate a decent plot from those data.
How about a longer series of data at daily resolution? Let’s get AAPL for the second quarter of 2024.
AAPL <- bars("AAPL", "1D", "2024-04-01", "2024-06-30")
Here’s the corresponding plot.
Latest Bars
The bars_latest() function will return the latest bars at 1 minute resolution. You can retrieve data for one or more symbols.
bars_latest(c("AAPL", "NVDA", "TSLA", "F", "AVGO", "ARM", "TSM", "QCOM"))
symbol timestamp open high low close volume trades vwap
1 ARM 2024-10-31 15:07:00 141.65 141.87 141.65 141.775 729 12 141.72286
2 AVGO 2024-10-31 15:07:00 168.9 169.19 168.9 169.14 3173 50 169.0625
3 F 2024-10-31 15:07:00 10.3 10.305 10.3 10.305 6319 13 10.301756
4 TSLA 2024-10-31 15:07:00 251.42 252.23 251.42 252.21 2021 42 251.89538
5 NVDA 2024-10-31 15:07:00 133.42 133.84 133.405 133.805 10303 84 133.71516
6 AAPL 2024-10-31 15:07:00 227.22 227.68 227.22 227.555 7731 91 227.53224
7 QCOM 2024-10-31 15:07:00 163.81 163.88 163.81 163.86 1041 35 163.85
8 TSM 2024-10-31 15:07:00 188.225 188.73 188.225 188.67 1456 25 188.6075
Live Quotes
Suppose you want to see the current prices rather than the historical aggregates. Now you need to access live quotes. Use the quotes_latest() function to retrieve the most recent quoted prices.
quotes_latest("AAPL")
symbol timestamp ask ask_size ask_exch bid bid_size bid_exch cond tape
1 AAPL 2024-10-31 15:10:42 227.3 2 V 226 1 V R C
As with the historical data you can retrieve quotes for multiple symbols.
quotes_latest(c("TSLA", "F", "AVGO", "ARM", "TSM", "QCOM"))
symbol timestamp ask ask_size ask_exch bid bid_size bid_exch cond tape
1 TSM 2024-10-31 15:11:40 190.5 5 V 188.1 1 V R A
2 ARM 2024-10-31 15:11:41 145.1 1 V 139 1 V R C
3 AVGO 2024-10-31 15:11:40 170.2 1 V 168.8 1 V ? C
4 F 2024-10-31 15:11:39 10.3 181 V 10.3 34 V R A
5 TSLA 2024-10-31 15:11:40 251.3 1 V 251.2 1 V R C
6 QCOM 2024-10-31 15:11:36 179 1 V 163 1 V R C
The quotes data contain the following fields:
symbol
timestamp
ask: the ask price
ask_size: the number of lots available at the ask price
ask_exch: the exchange ID (for the exchange that provided the ask price)
bid: the bid price
bid_size: the number of lots available at the bid price
bid_exch: the exchange ID (for the exchange that provided the bid price)
cond: any special conditions associated with the quote and
tape: the consolidated tape category.
See below for more information on exchange IDs and condition codes. For most stocks the lot size is 100.
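A few derived quantities are often handy when reading quotes. The sketch below takes three rows from the latest quotes output above and computes the bid-ask spread, the mid price and the quoted sizes in shares. The lot size of 100 is an assumption that holds for most stocks but not all.

```r
# Latest quotes for three symbols, copied from the output above.
quotes <- data.frame(
  symbol   = c("TSLA", "F", "TSM"),
  ask      = c(251.3, 10.3, 190.5),
  ask_size = c(1, 181, 5),
  bid      = c(251.2, 10.3, 188.1),
  bid_size = c(1, 34, 1)
)

lot_size <- 100  # Assumption: a round lot of 100 shares.

# Spread and mid price.
quotes$spread <- quotes$ask - quotes$bid
quotes$mid <- (quotes$ask + quotes$bid) / 2

# Convert quoted sizes from lots to shares.
quotes$ask_shares <- quotes$ask_size * lot_size
quotes$bid_shares <- quotes$bid_size * lot_size

quotes[, c("symbol", "spread", "mid", "ask_shares", "bid_shares")]
```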
Exchange Codes
To understand the exchange codes you can retrieve a lookup table:
exchange_codes()
code name
1 A NYSE American (AMEX)
2 B NASDAQ OMX BX
3 C National Stock Exchange
4 D FINRA ADF
5 E Market Independent
6 H MIAX
7 I International Securities Exchange
8 J Cboe EDGA
9 K Cboe EDGX
10 L Long Term Stock Exchange
11 M Chicago Stock Exchange
12 N New York Stock Exchange
13 P NYSE Arca
14 Q NASDAQ OMX
15 S NASDAQ Small Cap
16 T NASDAQ Int
17 U Members Exchange
18 V IEX
19 W CBOE
20 X NASDAQ OMX PSX
21 Y Cboe BYX
22 Z Cboe BZ
💡 Since these data are static, the results from the exchange_codes() function are cached. There’s no overhead associated with multiple calls.
Looking at the latest quotes data above we can see that all of those data come from the IEX exchange. Incidentally, this is the exchange covered in detail in Michael Lewis’s “Flash Boys”.
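If you’d rather see exchange names than codes in the quotes data, a lookup with match() will do the job. This is a minimal sketch using hand-copied subsets of the tables above rather than live calls to exchange_codes() and quotes_latest().

```r
# A subset of the lookup table returned by exchange_codes(), copied from above.
exchanges <- data.frame(
  code = c("P", "Q", "V"),
  name = c("NYSE Arca", "NASDAQ OMX", "IEX")
)

# Exchange IDs as they appear in the latest quotes output.
quotes <- data.frame(
  symbol   = c("AAPL", "TSLA"),
  ask_exch = c("V", "V"),
  bid_exch = c("V", "V")
)

# match() finds the row of the lookup table corresponding to each code.
quotes$ask_exch_name <- exchanges$name[match(quotes$ask_exch, exchanges$code)]
quotes$bid_exch_name <- exchanges$name[match(quotes$bid_exch, exchanges$code)]

quotes
```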
Condition Codes
You can also get an explanation of the condition codes. The codes vary according to the value of tape. For example, here are the codes for tape category C:
condition_codes("C")
code meaning
1 4 On Demand Intra Day Auction
2 A Manual Ask Automated Bid
3 B Manual Bid Automated Ask
4 F Fast Trading
5 H Manual Bid And Ask
6 I Order Imbalance
7 L Closed Quote
8 N Non Firm Quote
9 O Opening Quote Automated
10 R Regular Two Sided Open
11 U Manual Bid And Ask Non Firm
12 X Order Influx
13 Y No Offer No Bid One Sided Open
14 Z No Open No Resume
💡 These results are also cached.
The latest quotes data above mostly have condition code R, which indicates a “Regular Two Sided Open” quote.
Historical Quotes
You can also get historical quotes data via the quotes_history() function.
aapl <- quotes_history("AAPL", "2024-06-21 19:59:30", "2024-06-21 20:00:30")
That can produce a lot of data, so let’s look at just the first few records. I’ll omit a few columns for clarity.
timestamp ask ask_size ask_exch bid bid_size bid_exch
1 2024-06-21 19:59:30 207.96 6 Q 207.95 24 Q
2 2024-06-21 19:59:30 207.96 1 P 207.95 24 Q
3 2024-06-21 19:59:30 207.96 1 P 207.95 25 Q
4 2024-06-21 19:59:30 207.97 2 Q 207.95 25 Q
5 2024-06-21 19:59:30 207.97 2 P 207.95 26 Q
6 2024-06-21 19:59:30 207.96 9 Q 207.95 26 Q
Those quotes now come from a variety of exchanges. See the section on data feeds below.
I’m not sure about the best way to present these data. Below is an example of something that I’m experimenting with, which shows the individual quotes as points. The vertical axis reflects price. The points have 25% opacity, so multiple quotes will build up the colour. The size of the points scales with the quote size. The side is encoded by colour, with bid in red and ask in blue. I added a quotes_tidy() utility function that will transform the data into a form suitable for generating this plot.
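To give an idea of what a plot-ready form might look like, here is a hypothetical sketch that stacks the bid and ask sides into a long table using base R. This is not the implementation of quotes_tidy(), and the side, price and size column names are my own assumption about a convenient layout.

```r
# First three historical quotes for AAPL, copied from the output above.
quotes <- data.frame(
  timestamp = as.POSIXct(rep("2024-06-21 19:59:30", 3), tz = "UTC"),
  ask       = c(207.96, 207.96, 207.96),
  ask_size  = c(6, 1, 1),
  bid       = c(207.95, 207.95, 207.95),
  bid_size  = c(24, 24, 25)
)

# Stack the ask and bid sides so that each row is one side of one quote.
quotes_long <- rbind(
  data.frame(timestamp = quotes$timestamp, side = "ask",
             price = quotes$ask, size = quotes$ask_size),
  data.frame(timestamp = quotes$timestamp, side = "bid",
             price = quotes$bid, size = quotes$bid_size)
)

quotes_long
```

A long table like this maps naturally onto plot aesthetics: price on the vertical axis, side as colour and size as point size.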
Something interesting definitely seems to have occurred just before the market closed. Let’s zoom in on that.
The bid sizes kick off around 19:59:55. Sudden selling pressure drives the price down before the market closes. After the close the price swiftly recovers.
Let’s delve a bit deeper into this event. Below are the volume and trades for that day at various time resolutions. The volume is scaled down by a million and the trades by a thousand. We’ll start with hourly bars for the full day.
The largest number of trades occurs between 18:00 and 19:00. However, these were all relatively small trades on average. The last hour of the trading day, between 19:00 and 20:00, was dominated by a smaller number of very large trades, as indicated by the massive jump in volume.
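Comparing volume against the number of trades boils down to the average trade size. The AAPL bars for that day aren’t reproduced in this post, so as a stand-in this sketch uses three of the hourly SPY bars from the table near the top of the post.

```r
# Three hourly SPY bars from 2024-10-30, copied from the table above.
bars <- data.frame(
  timestamp = c("18:00", "19:00", "20:00"),
  volume    = c(3929147, 11065819, 6919995),
  trades    = c(36692, 84075, 10759)
)

# Average number of shares per trade in each bar.
bars$avg_trade_size <- bars$volume / bars$trades

bars
```

In these SPY data the 19:00 bar has the most trades, but the 20:00 bar has by far the largest average trade size.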
Now zoom in to 5 minute bars centred on the NYSE close of trade (20:00 UTC).
A similar pattern emerges, with the number of trades escalating until the 19:50 bar, before dropping in the 19:55 bar, where the volume peaks.
Zooming even further, down to 1 minute resolution.
Here we can see that the last minute of the trading day was dominated by a few large trades, in keeping with the historical quotes plot above.
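The times in this analysis are in UTC, which is why the NYSE close appears at 20:00. If you prefer to think in exchange local time, base R can do the conversion. This is a small sketch that assumes your R installation has time zone data available for America/New_York.

```r
# The NYSE close on 2024-06-21, expressed in UTC.
close_utc <- as.POSIXct("2024-06-21 20:00:00", tz = "UTC")

# The same instant in New York local time.
close_ny <- format(close_utc, tz = "America/New_York", usetz = TRUE)

close_ny
```

In June New York is on daylight saving time (UTC-4), so 20:00 UTC corresponds to the 16:00 close.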
Data Feeds
By default the latest quotes are retrieved from the IEX exchange. If you have a SIP subscription then you can use the feed argument to change the data source. The data from SIP is more extensive than that from IEX.
Historical quotes all come from SIP, so unless you have a SIP subscription you really only lose out on more detailed data for the latest quotes.