Commit 56c78e3c authored by Niels Cautaerts

added files

parent e438e27e
tags
.ipynb*
data
# Basics of working with git
Git is a version control system.
By creating "commits" we can save specific versions of files and return to those different points in time whenever we want.
There's a LOT more to git but for our purposes now this is enough.
## Steps
### Creating a git repo
* In the jupyter environment open a Terminal from the New dropdown menu
* In the terminal create a new folder with `mkdir name-of-project`. This folder will be our git repository and we will keep track of the history of everything we put inside.
* Copy the analysis notebook, the binder folder and the docker file inside with the command `cp <what you want to copy> name-of-project`
* From the jupyter file browser, create a file called `README.md` in this folder and add some information about the project. This will be displayed when you upload to github.
* In the terminal change directories into the folder with `cd name-of-project`.
* Initialize a git repository with `git init`
* Initialize your credentials:
```
$ git config --global user.email "YOUR EMAIL"
$ git config --global user.name "YOUR NAME"
```
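Put together, the setup steps above amount to this terminal session (the folder name and copied files are placeholders for your own project):

```shell
# create the folder that will become the git repository
mkdir name-of-project
# copy your materials in, e.g.:
# cp analysis.ipynb name-of-project
cd name-of-project
# turn the folder into a git repository
git init
# identify yourself to git (recorded in every commit)
git config --global user.email "YOUR EMAIL"
git config --global user.name "YOUR NAME"
```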
### Committing files to history
**Important: you should clear all output of notebooks and never commit large data files to git history. Git is made for keeping track of text files (under the hood Jupyter notebooks are still text files)**
* Check the status of files with `git status`
* Add all files to the staging area to be committed to history with `git add .`. If you don't want to add all files, use file paths instead of `.`
* Create a commit with `git commit -m "a short message describing the change"`
* Your files in their current state are now committed to history. If you make changes you can always return to this state.
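The commit cycle as one terminal session; the first few lines only set up a throwaway repository so the sketch is self-contained, and the commit message is just an example:

```shell
# setup for this sketch: a fresh repository with one file
mkdir demo-repo && cd demo-repo
git init
git config user.email "you@example.com"
git config user.name "Your Name"
echo "# demo" > README.md

# the actual workflow: inspect, stage, commit
git status
git add .
git commit -m "add README with project description"
# the new commit shows up in the history
git log --oneline
```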
### Publishing to Github
* First you need to create access credentials. Create an ssh key with the command `ssh-keygen` and press Enter at each prompt to accept the defaults. Don't set a passphrase.
* Now you need to copy the public key to github. Log into your Github account, click on your avatar in the top right, then go to settings, then SSH and GPG keys. Click on new ssh key.
* Go back to the terminal and type `cat ~/.ssh/id_rsa.pub`. Copy the output to the key field on Github, give it a name and add the key.
* Create a new repository on github. You will see instructions there. Add the repository as a remote with `git remote add origin git@github.com...`.
* Push your repository up to github with `git push -u origin master`. Refresh the github page to see your repo is now on github.
* Whenever you create commits locally and push, all those versions will be available on github.
* Ensure you have copied the binder folder into your repository and that it is on Github.
* if not, copy it into the folder, add, commit and push.
* Go to <https://mybinder.org/> and add the link to your repository in the box. Copy the markdown link to the badge into your README.md file.
* add, commit and push. Now you should be able to click the link and build and run the environment.
* check out the docker file
* check out the github actions workflow
* make sure it is available on your github repo. If not, add, commit, push.
* create an account on dockerhub. Go to account settings and create a security access token. Copy it to a safe place, it will only be shown once.
* Add this under your Github repository secrets as `DOCKERHUB_TOKEN`. Also add another secret `DOCKERHUB_USERNAME` to enter your dockerhub username.
* Modify the file `.github/workflows/build_docker_image.yaml` as instructed, adding your username and repository name in the right place.
* add, commit, push. Then create a release on github. See what happens under actions. When the build process is done, you can check your images on dockerhub.
* if you have docker locally installed on your computer you can download and run your image with the following command: `docker run -p 7000:8888 YOURUSERNAME/example_TEM_analysis:latest`. You can then visit `http://localhost:7000` to view the notebook.
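The git side of the publishing steps above, sketched below. The remote URL is a placeholder (copy the real one from GitHub's instructions), and the push itself is commented out because it needs the real repository and your ssh key:

```shell
# setup for this sketch: a throwaway repository standing in for your project
mkdir push-demo && cd push-demo
git init
# register the github repository as a remote called "origin"
git remote add origin git@github.com:YOURUSERNAME/name-of-project.git
# check it was registered
git remote -v
# upload local history; -u makes origin/master the default for future pushes
# git push -u origin master
```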
%% Cell type:markdown id:8d76e8b8 tags:
# Keeping track of experimental workflows with electronic lab notebook eLabFTW
[eLabFTW](https://www.elabftw.net) is a popular open source electronic lab notebook application that you can use as a powerful journaling application to keep digital records of your lab activities and objects.
I personally use it to document all my sessions at the microscope and create records of samples.
The advantages are:
* Data about experiments available from everywhere on any device
* Create clickable links between different items, for example samples and experiments, or experiments and the data files
* Add pictures, tables, data, drawings, descriptions all in one place
* Send links to others, or put links to the digital records on sample boxes as QR codes
* Python API to programmatically query/update database
We have set up a temporary eLabFTW instance at
#### <https://elabftwdemo.esc.mpcdf.mpg.de/login.php>
Check out the site as an anonymous visitor; the database on this dummy instance is public.
With a real account you could use this interface to add/update database items and experiments.
%% Cell type:markdown id:d1565d7c tags:
### The Python REST API
The real power of eLabFTW is its Python API, which allows us to query information from the database and integrate it with other programs, or send information to the database from other applications.
That is what we will do in this notebook.
First we set up a connection to the API with a token.
I created this token for a dummy user and it has read + write access, but in principle should not be able to delete any of the items already in the database.
%% Cell type:code id:e7e29eb4 tags:
``` python
# setting up the connection to the elab server with the elabapy package
import elabapy
URL = "https://elabftwdemo.esc.mpcdf.mpg.de/"
TOKEN = "028910cbb11c2af9a592ecea958e061589990094a69ffffd0e3dd494e440c017beff7bedafd105e6074c" # this is a read and write token
ENDPOINT = URL+"api/v1/"
manager = elabapy.Manager(endpoint=ENDPOINT, token=TOKEN, verify=True)
```
%% Cell type:markdown id:2b06499a tags:
Here we query all the data from the database
%% Cell type:code id:33354fbd tags:
``` python
print("------------")
print("Experiments:")
print("------------")
for i in manager.get_all_experiments():
    print(i["id"], i["title"])
print("------------")
print("Items:")
print("------------")
for i in manager.get_all_items():
    print(i["id"], i["title"])
```
%% Cell type:markdown id:87fe0daf tags:
Here we get one specific item out.
The response is a JSON string which captures all the known information about that item in the database.
We are especially interested in the "body" which is the text information we can read, and any uploaded items.
%% Cell type:code id:6db700b2 tags:
``` python
john = manager.get_item(1)
john
```
%% Cell type:markdown id:e523b575 tags:
We can display the body of the item directly in the notebook using IPython's rendering functions. However, we first need to replace the relative links in the body with absolute links, using the following helper function.
%% Cell type:code id:e103c9a5 tags:
``` python
import re

def render_links(html, url):
    """
    Helper function to return a string with absolute links to images, database items
    and experiments for rendering the body of a database item in the notebook
    """
    pattern_link = r'(<a href=")(.*)&amp;(id=[0-9]+">)'
    conversion_link = lambda x: x.group(1) + url + x.group(2) + "&" + x.group(3)
    pattern_img = r'(<img src=")(.*)(" \/>)'
    conversion_img = lambda x: x.group(1) + url + x.group(2) + x.group(3)
    updated_links = re.sub(pattern_link, conversion_link, html)
    return re.sub(pattern_img, conversion_img, updated_links)
```
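%% Cell type:markdown id: tags:
As a quick offline check of what these substitutions do, here they are applied to a made-up body snippet (the link targets are invented for illustration):
%% Cell type:code id: tags:
``` python
import re

URL = "https://elabftwdemo.esc.mpcdf.mpg.de/"
body = '<a href="database.php?mode=view&amp;id=4">FIB sample</a> <img src="app/download.php?f=fib.png" />'

# the same two substitutions the helper performs, spelled out step by step
absolute = re.sub(r'(<a href=")(.*?)&amp;(id=[0-9]+">)',
                  lambda m: m.group(1) + URL + m.group(2) + "&" + m.group(3), body)
absolute = re.sub(r'(<img src=")(.*?)(" \/>)',
                  lambda m: m.group(1) + URL + m.group(2) + m.group(3), absolute)
print(absolute)
```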
%% Cell type:code id:83aebd03 tags:
``` python
import IPython
```
%% Cell type:code id:b3cdff86 tags:
``` python
IPython.display.HTML(render_links(john["body"], URL))
```
%% Cell type:markdown id:bf10986c tags:
Here is the item corresponding to an experimental sample. The links are clickable and refer us to the right page on the elab website (after we log in).
%% Cell type:code id:9d567371 tags:
``` python
FIB_sample = manager.get_item(4)
IPython.display.HTML(render_links(FIB_sample["body"], URL))
```
%% Cell type:markdown id:4478e7b3 tags:
We can take this a step further and instead of just displaying what is on the page in the notebook, we can try to parse it so a computer might be able to do something with the information.
The body of each item is plain HTML, so it can be parsed with web scraping packages like beautifulsoup.
Let's build a little tool that will extract all the hyperlinks from the text and separate them by section.
%% Cell type:code id:d5ac31af tags:
``` python
from bs4 import BeautifulSoup
```
%% Cell type:code id:5daecd1e tags:
``` python
def parse_body(html, divider="h1", children=("p", "ol", "ul", "h1")):
    """Returns a dictionary with information divided by heading, links extracted"""
    parsed = BeautifulSoup(html, "html.parser")
    dictionary = {}
    for i in parsed.find_all(divider, recursive=False):
        dictionary[i.string] = {}
        subdict = dictionary[i.string]
        subdict["Contents"] = []
        subdict["Links"] = {}
        subdict["Image links"] = []
        subdict["Linked database items"] = []
        subdict["Linked experiments"] = []
        k = i
        while True:
            # find_next_sibling takes no recursive argument; filter on the tag names only
            k = k.find_next_sibling(list(children))
            if k is None or k.name == divider:
                break
            lnks = k.find_all("a", recursive=True)
            for lnk in lnks:
                subdict["Links"][lnk["href"]] = lnk.string
                db_item = re.compile(r"database\.php\?.*id=([0-9]+)").search(lnk["href"])
                exp_item = re.compile(r"experiments\.php\?.*id=([0-9]+)").search(lnk["href"])
                if db_item:
                    subdict["Linked database items"].append(int(db_item.groups()[0]))
                if exp_item:
                    subdict["Linked experiments"].append(int(exp_item.groups()[0]))
            imlks = k.find_all("img", recursive=True)
            for im in imlks:
                subdict["Image links"].append(im["src"])
            subdict["Contents"].append(repr(k))
    return dictionary
```
%% Cell type:code id:903b4225 tags:
``` python
def pretty_print_parsed(parsed_dict):
    for key, subdict in parsed_dict.items():
        print(key)
        print("-" * len(key))
        care = ["Links", "Image links", "Linked database items", "Linked experiments"]
        for j in care:
            if subdict[j]:
                print("> ", j)
                for i in subdict[j]:
                    print("  > ", i)
```
%% Cell type:code id:e5dc3c38 tags:
``` python
parsed_fib = parse_body(render_links(FIB_sample["body"], URL))
```
%% Cell type:code id:bd58bcfc tags:
``` python
pretty_print_parsed(parsed_fib)
```
%% Cell type:markdown id:487d629c tags:
We can also query the info on the experimental session to get access to the links where the actual data is stored. We can do a manual download or parse the html to get the URL as a string.
%% Cell type:code id:07f9b08c tags:
``` python
experiment = manager.get_experiment(1)
IPython.display.HTML(render_links(experiment["body"], URL))
```
%% Cell type:code id:8c8de798 tags:
``` python
parsed_experiment = parse_body(render_links(experiment["body"], URL))
data_file_link = list(parsed_experiment["Data files"]["Links"].values())[0]
print(data_file_link)
```
%% Cell type:markdown id:8fcd1fbc tags:
Such tools would allow us to visualize all interrelationships between database items and to write programs that crawl the database.
Now we turn our attention to adding things to the database.
### Adding items to the database
Try it yourself, then try to query your item or experiment back.
To check what is possible, see the documentation <https://doc.elabftw.net/api/#api-Entity>
%% Cell type:code id:add9b29d tags:
``` python
response = manager.create_item(1)
print(f"Created item with id {response['id']}.")
```
%% Cell type:code id:9bf80bf8 tags:
``` python
params = { "title": "Database item", "date": "20200504", "body": "Created from the API", "category": "Sample" } # in the "body" entry you can add arbitrary HTML and CSS syntax for formatting
print(manager.post_item(5, params))
```
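%% Cell type:markdown id: tags:
The calls above need the live server, but the payload itself is an ordinary dictionary. As a small offline sanity check, the field names below are taken from the call above, and the date is given as a YYYYMMDD string, matching it:
%% Cell type:code id: tags:
``` python
from datetime import datetime

params = {
    "title": "Database item",
    "date": "20200504",  # YYYYMMDD, as in the post_item call above
    "body": "Created from the API",
    "category": "Sample",
}
# strptime raises ValueError if the date string is malformed
parsed_date = datetime.strptime(params["date"], "%Y%m%d")
print(parsed_date.date())  # → 2020-05-04
```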
%% Cell type:code id:4f1aaff1 tags:
``` python
```
# Example reproducible TEM data analysis
## BigMax summer school 2021
##### Niels Cautaerts
##### last updated: 9/9/2021
This example walks through some techniques whereby we can improve the reproducibility of our experimental workflow.
We look at the following tools and techniques:
### 1. Electronic lab notebook eLabFTW
Digitizing our experimental workflow and the links between experiments, samples, and other lab resources is crucial for being able to trace our steps back from the results.
In addition to the web interface, we interact with the eLabFTW through Python in a Jupyter notebook.
### 2. Jupyter notebook based analysis
Jupyter notebooks are interactive worksheets in which we can write code to perform analysis and visualization of data.
Here we demonstrate a short machine learning inspired analysis workflow of a high resolution STEM image.
### 3. Git version control
Jupyter notebooks are already much more reproducible than click based workflows in GUI programs.
However they are prone to frequent updates and changes; how do you ensure everyone is looking at the same notebook?
We use version control to create "save points" for our notebook that everyone can go back to.
We will use git to start version controlling our notebook, and publish on Github.
### 4. MyBinder
Even if everyone has the same version of the notebook, the results might not be reproducible because users have different versions of software packages installed on their system.
One simple solution to this problem is to use a service like mybinder, which builds a jupyter environment from a predefined configuration file.
We go over best practices for making such a configuration file.
### 5. Docker
Even if we pin versions of software with MyBinder it doesn't guarantee we will always get exactly the same environment.
For example, the dependencies of the packages you need may not be pinned, and if those get updated things may still break.
To ensure a completely reproducible environment you want to package EVERYTHING (data, jupyter notebook, software) together in a single image.
This can be achieved with Docker.
We write a Dockerfile, which instructs the `docker` software how to build this image.
Redoing the build process with the same dockerfile may produce slightly different images, but the image itself is static and will always work in the same way.
Here we show how we can build a docker image with Github's CI/CD service.
vim
nano
-name: tutorial-env
+name: example
 channels:
   - conda-forge
 dependencies:
-  - python
-  - scipy
-  - beautifulsoup4
-  - requests
-  - scikit-learn
-  - scikit-image
-  - hyperspy
-  - atomap
-  - pip
-  - pip:
-    - elabapy
+  - hyperspy=1.6.4
+  - scikit-learn=0.24.2
+  - atomap=0.3.1
+  - beautifulsoup4=4.10.0
+  - elabapy=0.8.2
tags
.ipynb_checkpoints
.git
name: Build and push docker images to docker hub when we create a new tag
on:
  push:
    tags:
      - '*'
jobs:
  buildandpush:
    runs-on: ubuntu-latest
    steps:
      # Change the USERNAME and REPONAME in the first step below!
      - name: Get the latest tag
        id: release
        run: |
          echo "::set-output name=releasetag::$(curl -s https://api.github.com/repos/USERNAME/REPONAME/releases/latest | jq '.tag_name' | sed 's/\"//g')"
      - name: Checkout the repo on this tag
        uses: actions/checkout@v2
        with:
          ref: ${{ steps.release.outputs.releasetag }}
      # hardware emulation for different CPU architectures
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1
      # build system
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
      - name: Login to DockerHub
        uses: docker/login-action@v1
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      # images are pushed to the latest as well as specific version tag
      - name: Build and push
        id: docker_build
        uses: docker/build-push-action@v2
        with:
          push: true
          tags: |
            ${{ secrets.DOCKERHUB_USERNAME }}/example_TEM_analysis:latest
            ${{ secrets.DOCKERHUB_USERNAME }}/example_TEM_analysis:${{ steps.release.outputs.releasetag }}
# start from a base image providing conda
FROM continuumio/miniconda3
# add Tini
ENV TINI_VERSION v0.19.0
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /tini
RUN chmod +x /tini
ENTRYPOINT ["/tini", "--"]
# install the necessary dependencies
RUN conda install -c conda-forge hyperspy=1.6.4 scikit-learn=0.24.2 atomap=0.3.1 beautifulsoup4=4.10.0 elabapy=0.8.2
# copy all the necessary files into the container
RUN mkdir notebook && mkdir notebook/data
WORKDIR notebook/
COPY analysis.ipynb .
RUN wget https://owncloud.gwdg.de/index.php/s/utJfj0388mp8W1S/download -O data/dataset.emd
# command that runs when we spin up the container
CMD ["jupyter", "notebook", "--port=8888", "--no-browser", "--ip=0.0.0.0", "--allow-root"]
name: example
channels:
  - conda-forge
dependencies:
  - hyperspy=1.6.4
  - scikit-learn=0.24.2
  - atomap=0.3.1
  - beautifulsoup4=4.10.0
  - elabapy=0.8.2
%% Cell type:code id: tags:
``` python
import sklearn
```
%% Cell type:code id: tags:
``` python
import matplotlib
```
%% Cell type:code id: tags:
``` python
import numpy
```
%% Cell type:code id: tags:
``` python
import scipy
```
%% Cell type:code id: tags:
``` python
import skimage
```
%% Cell type:code id: tags:
``` python
import hyperspy
```
%% Cell type:code id: tags:
``` python
import atomap
```
%% Cell type:code id: tags:
``` python
import elabapy
```
%% Cell type:code id: tags:
``` python
```