At Fathom Data we have a number of workflows that require us to share various bits of data for a short time. The data are not sensitive, so we can freely share them. We have been doing this manually via platforms like Google Drive, Box or Dropbox. However, we then need to remember to go back and delete the files some time later. This is not ideal. What we needed was a simple “fire and forget” solution: share the files and have them disappear automatically after some time. This is precisely what Filebin does.
Filebin allows you to upload and share a file. The file can be deleted at any time and, if not manually deleted, it will be automatically removed after 6 days.
{filebin} R Package
There’s a neat Filebin API, so I built a little wrapper package, {filebin}, which allows direct access from R.
Install the package.
remotes::install_github("datawookie/filebin")
Load the package and check the version.
library(filebin)
packageVersion("filebin")
[1] ‘0.0.3’
Posting a File
I’ve got copies of a selection of Open Source licenses.
licenses
[1] "license-AGPL-3.md" "license-apache-2.md" "license-cc0.md" "license-ccby-4.md" "license-GPL-2.md" "license-GPL-3.md" "license-LGPL-2.1.md"
[8] "license-LGPL-3.md" "license-mit.md"
Let’s upload the LGPL to Filebin.
lgpl <- post("license-LGPL-3.md")
str(lgpl)
tibble [1 × 9] (S3: tbl_df/tbl/data.frame)
$ url : chr "https://filebin.net/d4i1rhv6ic6kl8fz/license-LGPL-3.md"
$ bin : chr "d4i1rhv6ic6kl8fz"
$ filename : chr "license-LGPL-3.md"
$ content_type: chr "text/plain; charset=utf-8"
$ bytes : int 7560
$ md5 : chr "YzE2MGRkNDE3YzEyM2RhZmY3YTYyODUyNzYxZDg3MDY="
$ sha256 : chr "446e755fae55ff034bbb21be44670b5f116c2b2667947e7036f2bfe6632539a8"
$ created : chr "2021-11-18T07:25:11.148268Z"
$ updated : chr "2021-11-18T07:25:11.148268Z"
The result contains the following fields:
url: the URL at which the file can be accessed
bin: the bin containing the file
filename: the file name
content_type: the inferred MIME type of the file
bytes: the file size
md5: the MD5 checksum
sha256: the SHA256 hash
created: the time at which the file was uploaded
updated: the time at which it was updated (or created if not updated).
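The MD5 checksum is the hexadecimal digest of the file, Base64 encoded. You can reproduce it locally.
library(dplyr)  # provides %>% and select(), used here and below
tools::md5sum("license-LGPL-3.md") %>% charToRaw() %>% base64enc::base64encode()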
[1] "YzE2MGRkNDE3YzEyM2RhZmY3YTYyODUyNzYxZDg3MDY="
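Going the other way is useful for checking a downloaded copy: decode the md5 field back to the hexadecimal digest and compare it with a digest computed locally. The following is a minimal sketch, assuming the base64enc package is installed. The helper names md5_field_to_hex() and md5_matches() are hypothetical and are not part of {filebin}.

```r
library(base64enc)

# Decode a Base64 string (as found in the md5 field) to the hex digest.
md5_field_to_hex <- function(md5_field) {
  rawToChar(base64decode(md5_field))
}

# Hypothetical helper: TRUE if the local file's MD5 matches the md5 field.
md5_matches <- function(path, md5_field) {
  unname(tools::md5sum(path)) == md5_field_to_hex(md5_field)
}

# The md5 value reported for license-LGPL-3.md above.
md5_field_to_hex("YzE2MGRkNDE3YzEyM2RhZmY3YTYyODUyNzYxZDg3MDY=")
# [1] "c160dd417c123daff7a62852761d8706"
```

With the upload result from earlier, the check would read md5_matches("license-LGPL-3.md", lgpl$md5).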
Bins
Files are organised into bins, which are analogous to folders or directories. By default the name of the bin is a random selection of text characters (see output above, where the bin name is d4i1rhv6ic6kl8fz). However, you can use the bin argument to specify a bin name.
gpl <- post("license-GPL-3.md", bin = "licenses")
str(gpl)
tibble [1 × 9] (S3: tbl_df/tbl/data.frame)
$ url : chr "https://filebin.net/licenses/license-GPL-3.md"
$ bin : chr "licenses"
$ filename : chr "license-GPL-3.md"
$ content_type: chr "text/plain; charset=utf-8"
$ bytes : int 34904
$ md5 : chr "MjlhOTAxMjk0MWE2YmNiMjZiZjBmYjQzODJjNWRkNzU="
$ sha256 : chr "585e25ef8f5946a52bf2aed68d5becfc38be94a8663aa01c1b31d88aa57f1de3"
$ updated : chr "2021-11-18T07:28:21.876551Z"
$ created : chr "2021-11-18T07:28:21.876551Z"
Multiple Files
You can simultaneously upload multiple files.
post(c(
"license-AGPL-3.md",
"license-GPL-2.md",
"license-GPL-3.md",
"license-LGPL-2.1.md",
"license-LGPL-3.md"
)) %>% select(url, created, updated)
# A tibble: 5 × 3
url created updated
<chr> <chr> <chr>
1 https://filebin.net/87ve2dy4mif2ci9v/license-AGPL-3.md 2021-11-18T07:49:22.51542Z 2021-11-18T07:49:22.51542Z
2 https://filebin.net/87ve2dy4mif2ci9v/license-GPL-2.md 2021-11-18T07:49:23.31774Z 2021-11-18T07:49:23.31774Z
3 https://filebin.net/87ve2dy4mif2ci9v/license-GPL-3.md 2021-11-18T07:49:23.54684Z 2021-11-18T07:49:23.54684Z
4 https://filebin.net/87ve2dy4mif2ci9v/license-LGPL-2.1.md 2021-11-18T07:49:24.14585Z 2021-11-18T07:49:24.14585Z
5 https://filebin.net/87ve2dy4mif2ci9v/license-LGPL-3.md 2021-11-18T07:49:26.82930Z 2021-11-18T07:49:26.82930Z
When you upload multiple files they all end up in the same bin. The files are uploaded sequentially and each is assigned its own created and updated time.
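The created and updated fields come back as ISO 8601 character strings rather than date-time objects. If you want to do arithmetic on them (for example, to see how long a batch upload took) they need to be parsed first. A small sketch in base R, using two timestamps copied from the output above:

```r
# created timestamps for the first and last file in the upload above.
created <- c("2021-11-18T07:49:22.51542Z", "2021-11-18T07:49:26.82930Z")

# %OS reads seconds including the fractional part; the trailing Z marks UTC.
created <- as.POSIXct(created, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

# Time elapsed between the first and last upload.
elapsed <- difftime(created[2], created[1], units = "secs")
elapsed
```

For this batch the elapsed time is a little over four seconds.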
Updating a File
You can update an existing file. In order to update a file rather than simply create a new upload, you need to specify the bin of the existing upload.
post("license-LGPL-2.1.md", bin = "87ve2dy4mif2ci9v") %>% select(url, created, updated)
# A tibble: 1 × 3
url created updated
<chr> <chr> <chr>
1 https://filebin.net/87ve2dy4mif2ci9v/license-LGPL-2.1.md 2021-11-18T07:49:24.14585Z 2021-11-18T07:50:57.52473Z
The created time is consistent with the original time that the file was uploaded (see above), but the updated time has been modified.
Retrieving a File
You can share either the url or the filename and bin. The file can then be downloaded via a browser, on the command line using curl or wget, or in R. Of course we are interested in the last option.
# Retrieve file using URL.
#
file_get("https://filebin.net/87ve2dy4mif2ci9v/license-LGPL-2.1.md")
# Retrieve file using filename and bin.
#
file_get(
"license-LGPL-2.1.md",
"87ve2dy4mif2ci9v",
overwrite = TRUE
)
The second call to file_get() sets overwrite = TRUE so that it replaces the file saved by the first call.
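The two forms are equivalent because the URL is just the bin and the filename appended to the Filebin host, as the URLs returned by post() show. For someone who does not have {filebin} installed, a rough base R alternative might look like the sketch below. The helper filebin_url() is hypothetical, and the commented download step assumes that the URL can be fetched directly and that the bin has not yet expired.

```r
# Build a file URL from a bin and a filename, following the pattern of the
# URLs returned by post() above. Hypothetical helper, not part of {filebin}.
filebin_url <- function(bin, filename, base = "https://filebin.net") {
  paste(base, bin, URLencode(filename, reserved = TRUE), sep = "/")
}

url <- filebin_url("87ve2dy4mif2ci9v", "license-LGPL-2.1.md")
url
# [1] "https://filebin.net/87ve2dy4mif2ci9v/license-LGPL-2.1.md"

# Download with base R (needs network access).
# download.file(url, destfile = basename(url), mode = "wb")
```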
Checking on a Bin
We can interrogate a bin using the bin_get() function.
licenses <- bin_get("87ve2dy4mif2ci9v")
The result is a list with two components, bin and files. The bin component includes the number of files and the total size. The readonly field indicates whether the bin has been locked against further updates.
str(licenses$bin)
tibble [1 × 7] (S3: tbl_df/tbl/data.frame)
$ id : chr "87ve2dy4mif2ci9v"
$ readonly: logi FALSE
$ bytes : int 121022
$ files : int 5
$ updated : chr "2021-11-18T07:50:57.529291Z"
$ created : chr "2021-11-18T07:49:22.39557Z"
$ expired : chr "2021-11-25T07:50:57.52929Z"
The files component has the details of each of the files in the bin.
licenses$files %>% select(filename, content_type, bytes, md5)
# A tibble: 5 × 4
filename content_type bytes md5
<chr> <chr> <int> <chr>
1 license-AGPL-3.md text/plain; charset=utf-8 34303 ZmIwMTYyNWVmMDE5NzM0OTBiY2Y0ZWZiOWFkZTIzYWU=
2 license-GPL-2.md text/plain; charset=utf-8 17941 M2Q4Mjc4MGU4OTE3YjM2MGNiZWU3YjllYzNlNDA3MzQ=
3 license-GPL-3.md text/plain; charset=utf-8 34904 MjlhOTAxMjk0MWE2YmNiMjZiZjBmYjQzODJjNWRkNzU=
4 license-LGPL-2.1.md text/plain; charset=utf-8 26314 OGY1MTA3ZDk4NzU3NzExZWNjMWIwN2FjMzM4Nzc1NjQ=
5 license-LGPL-3.md text/plain; charset=utf-8 7560 YzE2MGRkNDE3YzEyM2RhZmY3YTYyODUyNzYxZDg3MDY=
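The two components can be cross-checked against each other: the bytes and files fields of the bin component should agree with the per-file listing. The sketch below uses values copied from the output above, so it runs without network access.

```r
# File sizes copied from the listing above.
files <- data.frame(
  filename = c(
    "license-AGPL-3.md",
    "license-GPL-2.md",
    "license-GPL-3.md",
    "license-LGPL-2.1.md",
    "license-LGPL-3.md"
  ),
  bytes = c(34303L, 17941L, 34904L, 26314L, 7560L)
)

# Total size and file count, to compare with the bin component.
total_bytes <- sum(files$bytes)
total_bytes
# [1] 121022
nrow(files)
# [1] 5
```

Against live data the same comparison would be sum(licenses$files$bytes) == licenses$bin$bytes.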
Locking a Bin
It’s possible to lock a bin, making it read only. Once locked, a bin will accept neither new file uploads nor updates to existing files.
bin_lock("87ve2dy4mif2ci9v")
If we check back on the readonly field for this bin we find that it’s now TRUE.
bin_get("87ve2dy4mif2ci9v")$bin$readonly
[1] TRUE
Bin QR Code
A QR code is a handy way to share content. You can generate a QR code pointing to a bin, saved as a PNG file, with bin_qr_code().
bin_qr_code("87ve2dy4mif2ci9v")
[1] "87ve2dy4mif2ci9v.png"
Try it out. If you scan this code you’ll get the URL for the bin, which will stop working when the bin expires on 25 November 2021.
Deleting
You can delete individual files with file_delete() and whole bins with bin_delete(). Note: It’s still possible to delete a locked bin.
Conclusion
We’re going to be integrating the {filebin} package into a number of existing workflows.