Asset Price Data

In a previous post I looked at retrieving a list of assets from the Alpaca API using the {alpacar} R package. Now we’ll explore how to retrieve historical and current price data.

First you’ll need to load the package. If you haven’t installed it yet then you can follow the instructions here.

library(alpacar)

Before making any calls to the API you’ll also need to authenticate.

Historical Bars

The most common representation of asset price data is via some form of an OHLC (open, high, low, close) chart, so this seems like a natural place to start. OHLC data can be retrieved at various resolutions using the bars() function.

SPY <- bars("SPY", "1H", "2024-10-30", "2024-10-31")

That retrieves a single day of OHLC data for SPY at 1 hour resolution.

Both the symbol and timeframe parameters need to be specified. The symbol can be any one of the assets listed in the previous post. The timeframe should have one of the following units:

  • T — minute
  • H — hour
  • D — day
  • W — week or
  • M — month.

The remaining arguments are the start and stop times.
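If you build timeframe strings programmatically then it can help to validate them before calling the API. Here's a minimal sketch in base R. The parse_timeframe() helper is hypothetical (it is not part of {alpacar}) and it assumes that a timeframe is a whole number followed by one of the unit letters listed above, as in "1H" and "1D".

```r
# Hypothetical helper (not part of {alpacar}): split a timeframe such as "15T"
# into its numeric multiplier and its unit.
parse_timeframe <- function(timeframe) {
  units <- c(T = "minute", H = "hour", D = "day", W = "week", M = "month")
  # Assumed format: digits followed by exactly one unit letter.
  parts <- regmatches(timeframe, regexec("^([0-9]+)([THDWM])$", timeframe))[[1]]
  if (length(parts) == 0) {
    stop("Invalid timeframe: ", timeframe)
  }
  list(n = as.integer(parts[2]), unit = unname(units[parts[3]]))
}

parse_timeframe("1H")
parse_timeframe("15T")
```

A string that doesn't match the assumed pattern, like "1X", raises an error rather than being passed on to the API.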

💡 You can get data for multiple assets by specifying a vector of symbols.

   timestamp               open    high      low    close   volume trades      vwap
 1 2024-10-30 08:00:00 582.89   583.23  582.5    582.66      25070    459 582.87650
 2 2024-10-30 09:00:00 582.58   583.13  582.5    582.92      11240    248 582.85297
 3 2024-10-30 10:00:00 582.95   583.05  582      582.16     112383   1002 582.56281
 4 2024-10-30 11:00:00 582.13   582.65  581.77   581.91     179837   1562 582.29877
 5 2024-10-30 12:00:00 582.37   583.06  581.39   581.55     307287   3514 581.87937
 6 2024-10-30 13:00:00 581.52   581.91  579.29   581.85    4649025  50171 580.79442
 7 2024-10-30 14:00:00 581.84   582.96  581.7401 582.3799  3837746  86580 582.43233
 8 2024-10-30 15:00:00 582.3799 583.32  581.59   583.105   3152102  38712 582.55026
 9 2024-10-30 16:00:00 583.1    583.21  581.99   582       1971462  28405 582.55270
10 2024-10-30 17:00:00 581.99   582.305 580.29   580.64    4096049  40733 581.41390
11 2024-10-30 18:00:00 580.624  581.81  580.624  581.18    3929147  36692 581.35409
12 2024-10-30 19:00:00 581.17   581.17  579.38   579.97   11065819  84075 580.10523
13 2024-10-30 20:00:00 579.94   580.61  578.75   579.22    6919995  10759 579.77867
14 2024-10-30 21:00:00 579.2501 579.37  578.43   578.45     542453   1658 578.96646
15 2024-10-30 22:00:00 578.61   579.08  577.87   578.13     250834   2252 578.21520
16 2024-10-30 23:00:00 578.1199 578.3   577.8    577.9       75770    922 577.99987

I dropped the symbol column for brevity. There are three extra columns that merit an explanation:

  • volume — the total number of shares traded;
  • trades — the number of individual transactions; and
  • vwap — the Volume-Weighted Average Price, which considers the price and volume for each trade during the interval and calculates the average price weighted by the volume. This gives a more accurate indication of the effective price at which the assets traded.
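To make the VWAP definition concrete, here's a small worked example in base R. The trades are made up purely for illustration; they don't come from the API.

```r
# Hypothetical trades within a single bar (made-up prices and sizes).
trades <- data.frame(
  price = c(100.00, 100.10, 99.90),
  size  = c(200, 50, 750)
)

# The simple average ignores how many shares changed hands at each price.
mean_price <- mean(trades$price)

# VWAP weights each price by the number of shares traded at that price.
vwap <- sum(trades$price * trades$size) / sum(trades$size)

mean_price
vwap
```

Most of the volume in this example traded at 99.90, so the VWAP (99.93) sits below the simple average price (100.00).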

We can use chartSeries() from {quantmod} to generate a decent plot from those data.
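Here's a rough sketch of how such a plot might be produced. It's not necessarily the exact code behind the chart in this post. To keep it self-contained I've typed in the first three bars from the table above rather than calling the API, and the plotting step only runs if {xts} and {quantmod} are installed. I'm assuming that the timestamps are in UTC.

```r
# First three hourly SPY bars, copied from the table above.
spy <- data.frame(
  timestamp = as.POSIXct(
    c("2024-10-30 08:00:00", "2024-10-30 09:00:00", "2024-10-30 10:00:00"),
    tz = "UTC"  # assumption: timestamps are UTC
  ),
  open   = c(582.89, 582.58, 582.95),
  high   = c(583.23, 583.13, 583.05),
  low    = c(582.50, 582.50, 582.00),
  close  = c(582.66, 582.92, 582.16),
  volume = c(25070, 11240, 112383)
)

if (requireNamespace("xts", quietly = TRUE) &&
    requireNamespace("quantmod", quietly = TRUE)) {
  library(quantmod)
  # chartSeries() expects a time series object, so index the bars by timestamp.
  spy_xts <- xts::xts(
    spy[, c("open", "high", "low", "close", "volume")],
    order.by = spy$timestamp
  )
  # Candlesticks with the default volume panel underneath.
  chartSeries(spy_xts, name = "SPY", type = "candlesticks")
}
```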

Candlestick chart of SPY with volume histogram.

How about a longer series of data at daily resolution? Let’s get AAPL for the second quarter of 2024.

AAPL <- bars("AAPL", "1D", "2024-04-01", "2024-06-30")

Here’s the corresponding plot.

Candlestick chart of AAPL with volume histogram.

Latest Bars

The bars_latest() function will return the latest bars at 1 minute resolution. You can retrieve data for one or more symbols.

bars_latest(c("AAPL", "NVDA", "TSLA", "F", "AVGO", "ARM", "TSM", "QCOM"))
  symbol timestamp              open    high     low   close volume trades       vwap
1 ARM    2024-10-31 15:07:00 141.65  141.87  141.65  141.775    729     12 141.72286 
2 AVGO   2024-10-31 15:07:00 168.9   169.19  168.9   169.14    3173     50 169.0625  
3 F      2024-10-31 15:07:00  10.3    10.305  10.3    10.305   6319     13  10.301756
4 TSLA   2024-10-31 15:07:00 251.42  252.23  251.42  252.21    2021     42 251.89538 
5 NVDA   2024-10-31 15:07:00 133.42  133.84  133.405 133.805  10303     84 133.71516 
6 AAPL   2024-10-31 15:07:00 227.22  227.68  227.22  227.555   7731     91 227.53224 
7 QCOM   2024-10-31 15:07:00 163.81  163.88  163.81  163.86    1041     35 163.85    
8 TSM    2024-10-31 15:07:00 188.225 188.73  188.225 188.67    1456     25 188.6075

Live Quotes

Suppose you want to see the current prices rather than the historical aggregates. Now you need to access live quotes. Use the quotes_latest() function to retrieve the most recent quoted prices.

quotes_latest("AAPL")
  symbol timestamp             ask ask_size ask_exch   bid bid_size bid_exch cond tape
1 AAPL   2024-10-31 15:10:42 227.3        2 V          226        1 V        R    C

As with the historical data you can retrieve quotes for multiple symbols.

quotes_latest(c("TSLA", "F", "AVGO", "ARM", "TSM", "QCOM"))
  symbol timestamp             ask ask_size ask_exch   bid bid_size bid_exch cond tape
1 TSM    2024-10-31 15:11:40 190.5        5 V        188.1        1 V        R    A
2 ARM    2024-10-31 15:11:41 145.1        1 V        139          1 V        R    C
3 AVGO   2024-10-31 15:11:40 170.2        1 V        168.8        1 V        ?    C
4 F      2024-10-31 15:11:39  10.3      181 V         10.3       34 V        R    A
5 TSLA   2024-10-31 15:11:40 251.3        1 V        251.2        1 V        R    C
6 QCOM   2024-10-31 15:11:36 179          1 V        163          1 V        R    C

The quotes data contain the following fields:

  • symbol
  • timestamp
  • ask — the ask price
  • ask_size — the number of lots available at the ask price
  • ask_exch — the exchange ID (for the exchange that provided the ask price)
  • bid — the bid price
  • bid_size — the number of lots available at the bid price
  • bid_exch — the exchange ID (for the exchange that provided the bid price)
  • cond — any special conditions associated with the quote and
  • tape — the consolidated tape category.

See below for more information on exchange IDs and condition codes. For most stocks the lot size is 100 shares.
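A couple of derived quantities are often more useful than the raw fields. The sketch below works out the bid-ask spread, the mid price and the quoted sizes in shares. The data are typed in from the multi-symbol output above, and the lot size of 100 shares is an assumption that won't hold for every stock.

```r
# Latest quotes copied from the multi-symbol output above.
quotes <- data.frame(
  symbol   = c("TSM", "ARM", "AVGO", "F", "TSLA", "QCOM"),
  ask      = c(190.5, 145.1, 170.2, 10.3, 251.3, 179),
  ask_size = c(5, 1, 1, 181, 1, 1),
  bid      = c(188.1, 139, 168.8, 10.3, 251.2, 163),
  bid_size = c(1, 1, 1, 34, 1, 1)
)

lot_size <- 100  # assumption: round lots of 100 shares

quotes$spread     <- quotes$ask - quotes$bid
quotes$mid        <- (quotes$ask + quotes$bid) / 2
quotes$ask_shares <- quotes$ask_size * lot_size
quotes$bid_shares <- quotes$bid_size * lot_size

# Widest spread first.
quotes[order(-quotes$spread), c("symbol", "spread", "mid", "ask_shares", "bid_shares")]
```

In this snapshot QCOM has by far the widest spread, while the bid and ask for F are the same price.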

Exchange Codes

To understand the exchange codes you can retrieve a lookup table:

exchange_codes()
   code                              name
1     A              NYSE American (AMEX)
2     B                     NASDAQ OMX BX
3     C           National Stock Exchange
4     D                         FINRA ADF
5     E                Market Independent
6     H                              MIAX
7     I International Securities Exchange
8     J                         Cboe EDGA
9     K                         Cboe EDGX
10    L          Long Term Stock Exchange
11    M            Chicago Stock Exchange
12    N           New York Stock Exchange
13    P                         NYSE Arca
14    Q                        NASDAQ OMX
15    S                  NASDAQ Small Cap
16    T                        NASDAQ Int
17    U                  Members Exchange
18    V                               IEX
19    W                              CBOE
20    X                    NASDAQ OMX PSX
21    Y                          Cboe BYX
22    Z                           Cboe BZ

💡 Since these data are static, the results from the exchange_codes() function are cached. There’s no overhead associated with multiple calls.

Looking at the latest quotes data above we can see that all of those data come from the IEX exchange. Incidentally, this is the exchange covered in detail in Michael Lewis’s “Flash Boys”.

Condition Codes

You can also get an explanation of the condition codes. The codes vary according to the value of tape. For example, here are the codes for tape category C:

condition_codes("C")
   code                        meaning
1     4    On Demand Intra Day Auction
2     A       Manual Ask Automated Bid
3     B       Manual Bid Automated Ask
4     F                   Fast Trading
5     H             Manual Bid And Ask
6     I                Order Imbalance
7     L                   Closed Quote
8     N                 Non Firm Quote
9     O        Opening Quote Automated
10    R         Regular Two Sided Open
11    U    Manual Bid And Ask Non Firm
12    X                   Order Influx
13    Y No Offer No Bid One Sided Open
14    Z              No Open No Resume

💡 These results are also cached.

The latest quotes data above mostly have condition code R, which indicates that they are “Regular Two Sided Open” quotes.
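Here's a sketch of attaching those meanings to the quotes themselves. To keep it self-contained I've typed in a subset of the tape C lookup table and the tape C rows from the multi-symbol output above. In practice you'd use the result of condition_codes("C") as the lookup table.

```r
# Tape C condition codes (a subset of the table above).
conditions <- data.frame(
  code    = c("L", "N", "R"),
  meaning = c("Closed Quote", "Non Firm Quote", "Regular Two Sided Open")
)

# Symbol and condition for the tape C quotes in the multi-symbol output above.
quotes <- data.frame(
  symbol = c("ARM", "AVGO", "TSLA", "QCOM"),
  cond   = c("R", "?", "R", "R")
)

# match() gives the row of the lookup table for each condition code.
quotes$meaning <- conditions$meaning[match(quotes$cond, conditions$code)]

# Codes without an entry in the lookup table come back as NA.
quotes[is.na(quotes$meaning), ]

# How often does each condition code occur?
table(quotes$cond)
```

The "?" condition on the AVGO quote doesn't appear in the tape C table above, so it ends up with a missing meaning.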

Historical Quotes

You can also get historical quotes data via the quotes_history() function.

aapl <- quotes_history("AAPL", "2024-06-21 19:59:30", "2024-06-21 20:00:30")

That can produce a lot of data, so let’s look at just the first few records. I’ll omit a few columns for clarity.

  timestamp              ask ask_size ask_exch    bid bid_size bid_exch
1 2024-06-21 19:59:30 207.96        6 Q        207.95       24 Q
2 2024-06-21 19:59:30 207.96        1 P        207.95       24 Q
3 2024-06-21 19:59:30 207.96        1 P        207.95       25 Q
4 2024-06-21 19:59:30 207.97        2 Q        207.95       25 Q
5 2024-06-21 19:59:30 207.97        2 P        207.95       26 Q
6 2024-06-21 19:59:30 207.96        9 Q        207.95       26 Q

Those quotes now come from a variety of exchanges. See the section below on data feeds.
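A quick tally shows which exchanges are involved. This sketch uses only the six records shown above, with the exchange names taken from the lookup table in the Exchange Codes section.

```r
# Ask and bid exchange IDs from the six historical AAPL quotes above.
ask_exch <- c("Q", "P", "P", "Q", "P", "Q")
bid_exch <- c("Q", "Q", "Q", "Q", "Q", "Q")

# Names for the two codes involved, taken from the exchange lookup table.
exchange_names <- c(P = "NYSE Arca", Q = "NASDAQ OMX")

# Count quotes per exchange on each side of the book.
ask_counts <- table(exchange_names[ask_exch])
bid_counts <- table(exchange_names[bid_exch])

ask_counts
bid_counts
```

In this tiny sample the asks are split evenly between NYSE Arca and NASDAQ OMX, while all of the bids come from NASDAQ OMX.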

I’m not sure about the best way to present these data. Below is an example of something that I’m experimenting with, which shows the individual quotes as points. The vertical axis reflects price. The points have 25% opacity, so multiple quotes will build up the colour. The size of the points scales with the quote size. The side is encoded by colour, with bid in red and ask in blue. I added a quotes_tidy() utility function that will transform the data into a form suitable for generating this plot.
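To give an idea of what that transformation involves, here's a sketch using a hypothetical tidy_quotes() function. It's a stand-in that I've written for illustration and it may well differ from what quotes_tidy() actually returns. The plotting step assumes that {ggplot2} is installed and is only an approximation of the plot described above.

```r
# First three historical AAPL quotes from the table above. Fractional seconds
# are not shown there, so the timestamps are identical here.
quotes <- data.frame(
  timestamp = as.POSIXct(rep("2024-06-21 19:59:30", 3), tz = "UTC"),
  ask       = c(207.96, 207.96, 207.96),
  ask_size  = c(6, 1, 1),
  bid       = c(207.95, 207.95, 207.95),
  bid_size  = c(24, 24, 25)
)

# Hypothetical stand-in for quotes_tidy(): one row per quote per side.
tidy_quotes <- function(quotes) {
  ask <- data.frame(timestamp = quotes$timestamp, side = "ask",
                    price = quotes$ask, size = quotes$ask_size)
  bid <- data.frame(timestamp = quotes$timestamp, side = "bid",
                    price = quotes$bid, size = quotes$bid_size)
  rbind(ask, bid)
}

long <- tidy_quotes(quotes)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Translucent points, sized by quote size and coloured by side.
  p <- ggplot(long, aes(x = timestamp, y = price, colour = side, size = size)) +
    geom_point(alpha = 0.25) +
    scale_colour_manual(values = c(bid = "red", ask = "blue"))
}
```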

Historical quotes for AAPL with duration 1 minute.

Something interesting definitely seems to have occurred just before the market closed. Let’s zoom in on that.

Historical quotes for AAPL with duration 15 seconds.

The bid sizes kick off around 19:59:55. Sudden selling pressure drives the price down before the market closes. After the close the price swiftly recovers.

Let’s delve a bit deeper into this event. Below are the volume and trades for that day at various time resolutions. The volume is scaled down by a million and the trades by a thousand. We’ll start with hourly bars for the full day.

Comparing volume and number of trades using 1 hour bars over a number of hours.

The largest number of trades occurs between 18:00 and 19:00. However, these were all relatively small trades on average. The last hour of the trading day, between 19:00 and 20:00, was dominated by a smaller number of very large trades, as indicated by the massive jump in volume.
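Dividing volume by the number of trades gives the average trade size, which puts a number on this kind of comparison. The AAPL bars for that day aren't listed in this post, so the sketch below uses three of the hourly SPY bars from the table near the top instead. The calculation is the same.

```r
# Three hourly SPY bars copied from the table near the top of the post.
spy <- data.frame(
  hour   = c("13:00", "19:00", "20:00"),
  volume = c(4649025, 11065819, 6919995),
  trades = c(50171, 84075, 10759)
)

# Average number of shares per trade in each bar.
spy$shares_per_trade <- spy$volume / spy$trades

spy
```

The 19:00 bar has the most trades, but the 20:00 bar has far more shares per trade.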

Now zoom in to 5 minute bars centred on the NYSE close of trade (20:00 UTC).
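If the UTC times are confusing then it's easy to convert them to New York time. Here's a quick check in base R, assuming that your system has the standard time zone database available.

```r
# The market close on 21 June 2024, expressed in UTC.
close_utc <- as.POSIXct("2024-06-21 20:00:00", tz = "UTC")

# The same instant on a New York wall clock.
close_ny <- format(close_utc, "%Y-%m-%d %H:%M", tz = "America/New_York")

close_ny
```

That gives 16:00 in New York. The four hour offset only applies while daylight saving time is in effect. In winter the 16:00 close corresponds to 21:00 UTC.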

Comparing volume and number of trades using 5 minute bars over two hours.

A similar pattern emerges, with the number of trades escalating until the 19:50 bar, before dropping in the 19:55 bar, where the volume peaks.

Zoom in even further, down to 1 minute resolution.

Comparing volume and number of trades using 1 minute bars over 30 minutes.

Here we can see that the last minute of the trading day was dominated by a few large trades, in keeping with the historical quotes plot above.

Data Feeds

By default the latest quotes are retrieved from the IEX exchange. If you have a SIP (Securities Information Processor) subscription then you can use the feed argument to change the data source. The data from SIP are more extensive than those from IEX.

Historical quotes all come from SIP, so unless you have a SIP subscription you really only lose out on more detailed data for the latest quotes.