NetNut Proxies
In this post I’ll be testing the proxy service provided by NetNut. For a bit of context take a look at my What is a Proxy? post.
A proxy is a server or software that acts as an intermediary between a client (often a web browser) and one or more servers, typically on the internet. Proxies are used for a variety of purposes, including improving security, enhancing privacy, managing network traffic, and bypassing restrictions.
I recently migrated this blog from GitLab Pages to Vercel. There were two main reasons for the move:
For a side project I needed to scrape data for the NYSE Composite Index going back as far as possible.
In a previous post I looked at retrieving a list of assets from the Alpaca API using the {alpacar} R package. Now we’ll explore how to retrieve historical and current price data.
How to list assets available to trade via the Alpaca API using the {alpacar} R package.
The {alpacar} package for R is a wrapper around the Alpaca API. API documentation can be found here. In this introductory post I show how to install and load the package, then authenticate with the API and retrieve account information.
A few days ago I wrote about a scraper for gathering economic calendar data. Well, I’m back again to write about another aspect of the same project: acquiring earnings calendar data.
Avoiding data duplication is a persistent challenge with acquiring data from websites or APIs. You can try to brute force it: pull the data again and then compare it locally to establish whether it’s fresh or stale. But there are other approaches that, if supported, can make this a lot simpler.
If you use Selenium for browser automation then at some stage you are likely to need to download a file by clicking a button or link on a website. Sometimes this just works. Other times it doesn’t.
I needed an offline copy of an economic calendar with all of the major international economic events. After grubbing around the internet I found the Economic Calendar on Myfxbook which had everything that I needed.
A few months ago I listened to an episode on the Founder’s Journal podcast that reviewed an essay, The Opportunity Cost of Everything, by Jack Raines. If you haven’t read it, then I suggest you invest 10 minutes in doing so. It will be time well spent.
Cloudflare is a service that aims to improve the performance and security of websites. It operates as a content delivery network (CDN) to ensure faster load times and consequently better user experience. However, it also protects against online threats by filtering “malicious” traffic.
Web scraping requests are often deemed to be malicious (certainly by Cloudflare!) and thus blocked. There are various approaches to circumventing this, most of which involve running a live browser instance. For some applications though, this is a big hammer for a small nail. The cloudscraper package provides a lightweight option for dealing with Cloudflare and has an API similar to the requests package.
cURL is the ultimate Swiss Army Knife for interacting with network protocols. But to be honest, I really only scratch the surface of what’s possible. Usually my workflow is something like this:
I’m going to take a look at my favourite online tool for converting a cURL command to code and then see what other tools there are, focusing on Python and R as target languages.
The Big Book of R provides a comprehensive and ever-growing overview of a broad selection of R programming books. It was created and is maintained by Oscar Baruffa. The collection began with approximately 100 books and, with the help of contributions from the R community, has subsequently expanded to over 400. The books are grouped into topics such as geospatial, machine learning, statistics, text analysis, and many more. The Big Book of R is an excellent resource for anyone learning R programming, whether they are a beginner or advanced user.
The ability to specify a message ID in emails sent from the {emayili} package makes it possible to create email threads.
A new minor version of the openai-python package was released late on Friday 7 June 2024, only a couple of days after the last minor release. This release adds a chunking_strategy argument to the methods for adding files to vector stores.
Quick notes on installing Docker on various platforms:
This question on Stack Overflow was a fun challenge: extract the markers off an embedded Google Map.
The R version of my Desert Island Docker talk. Similar idea to Desert Island Docker: Python Edition.