{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "setup_test_v031.ipynb", "provenance": [], "collapsed_sections": [ "9UOxkwGByHza" ], "authorship_tag": "ABX9TyNhsXFf0SqHMQQl4N2VxjZH", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "metadata": { "id": "cG6SfDhdf1uV" }, "source": [ "# Dashboard\n", "\n", "[Dashboard](https://console.cloud.google.com/monitoring/dashboards/builder/broker-instance-v031?project=ardent-cycling-243415&dashboardBuilderState=%257B%2522editModeEnabled%2522:false%257D&startTime=20210412T052643-04:00&endTime=20210412T073043-04:00) showing a test run of the `v031` broker instance." ] }, { "cell_type": "markdown", "metadata": { "id": "C5UKvNFGx3lg" }, "source": [ "# Setup" ] }, { "cell_type": "code", "metadata": { "id": "ApRx7Z5_Uk7n" }, "source": [ "# Create a function to run and print a shell command.\n", "def run(cmd: str):\n", " print('>> {}'.format(cmd))\n", " !{cmd}\n", " print('')" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "kIvsQydOW78f" }, "source": [ "from google.colab import auth" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "ID2RldmVW7un" }, "source": [ "auth.authenticate_user()" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "98S4RrLWUSmt" }, "source": [ "# install pgb\n", "run('python3 -m pip install --index-url https://test.pypi.org/simple/ --no-deps pgb_utils')\n", "\n", "# install dependencies\n", "packages = ['apache-beam','google-apitools','APLpy',\n", " 'beautifulsoup4==4.8', # plot_cutouts grayscale stretch='arcsinh'\n", " 'astropy-healpix==0.6', # plot_cutouts grayscale stretch='arcsinh'\n", " 'astropy==3.2.1', # plot_cutouts grayscale stretch='arcsinh'\n", " ]\n", "for package in packages:\n", " run(f'pip install --quiet {package}')" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "d7U6RJTmTYPT" }, "source": [ "from google.cloud import bigquery\n", "import pgb_utils as pgb\n", "\n", "project = 'ardent-cycling-243415'\n", "dataset = 'ztf_alerts'" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "kU_nVge4Ye9e" }, "source": [ "# Connect your Google Drive file system\n", "drive.mount('/content/drive') # follow the instructions to authorize access\n", "colabpath = '/content/drive/MyDrive/Colab\\ Notebooks/PGB_dev'\n", "colabpath_noesc = '/content/drive/MyDrive/Colab Notebooks/PGB_dev'\n", "run(f'mkdir -p {colabpath}')" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "Z5YSJwN988_p" }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "id": "oFrT1cHd9XZc" }, "source": [ "# Create broker instance Dashboards" ] }, { "cell_type": "markdown", "metadata": { "id": "54NfRYmA9edQ" }, "source": [ "- [Use the Dashboard API to build your own monitoring dashboard](https://cloud.google.com/blog/products/management-tools/cloud-monitoring-dashboards-using-an-api)\n", "- [Managing dashboards by API](https://cloud.google.com/monitoring/dashboards/api-dashboard)" ] }, { "cell_type": "markdown", "metadata": { "id": "vuNlYXn69d6U" }, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": { "id": "_7i22PV68-2C" }, "source": [ "# DB Tables, Storage Buckets" ] }, { "cell_type": "markdown", "metadata": { "id": "vdr7dja0FGVa" }, "source": [ "## In `setup_broker.sh`, create BQ tables from json schemas\n", "\n", "The current way we create tables for a testing instance does not work. The `alerts` table structure is wrong. Instead, let's get schema json files and use them to create the tables.\n", "\n", "- [Copying a single source table](https://cloud.google.com/bigquery/docs/managing-tables#copying_a_single_source_table)" ] }, { "cell_type": "markdown", "metadata": { "id": "nNElEPY5OVGT" }, "source": [ "- download schemas using `bq show --schema --format=prettyjson ardent-cycling-243415:ztf_alerts.alerts > alerts_schema.json`, etc.\n", "- create tables using `bq mk --table ardent-cycling-243415:ztf_alerts_.alerts alerts_schema.json`, etc." ] }, { "cell_type": "markdown", "metadata": { "id": "SZj1bVnvS-53" }, "source": [ "## Create `ztf_alerts.DIASource` table\n", "\n", "We need to provide the schema when creating the table. Easiest way is to \n", "\n", "~[create it from a query result](https://cloud.google.com/bigquery/docs/tables#creating_a_table_from_a_query_result) on the alerts table.~\n", "\n", "~Taking the [candidate schema file from ZTF's repo](https://github.com/ZwickyTransientFacility/ztf-avro-alert/blob/master/schema/candidate.avsc) and altering it to get what we need~\n", "\n", "Download the alerts table schema, then alter it.\n", "\n", "Links:\n", "- [Specifying a schema](https://cloud.google.com/bigquery/docs/schemas)" ] }, { "cell_type": "code", "metadata": { "id": "gFE6sVl4kT3H" }, "source": [ "run('bq show \\\n", "--schema \\\n", "--format=prettyjson \\\n", "ardent-cycling-243415:ztf_alerts.alerts > alerts_schema.json')" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "tA_M2nyBh6B0" }, "source": [ "- remove everything after the candidate record\n", "- un-nest the candidate record and get rid of the main `RECORD` entry\n", "- remove duplicate candid (save description)\n", "- add `prv_candidates_candids` ~repeated field~ (let's not complicate things, just use a single string) field with type string\n", "- save as `DIASource_schema.json`" ] }, { "cell_type": "code", "metadata": { "id": "43U4oaBybzMQ" }, "source": [ "run('bq mk --table ardent-cycling-243415:ztf_alerts.DIASource DIASource_schema.json')" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "y5HaV0hTbySB" }, "source": [ "" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "zs1YHGgIXjry" }, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": { "id": "Ocb_ki6HTUc1" }, "source": [ "## Make BigQuery dataset `ztf_alerts` publicly queryable\n", "\n", "- go to the dataset in the Console\n", "- click \"Share Dataset\"\n", "- \"allUsers\" (should I instead used \"allAuthenticatedUsers\"? I don't understand why allUsers is an option.. you have to have authenticate your credentials to create a (python) client)\n", " - \"BigQuery Data Viewer\"\n", " - \"BigQuery Metadata Viewer\"\n", "\n", "Links to more info:\n", "- [Sharing a dataset with the public](https://cloud.google.com/bigquery/public-data#sharing_a_dataset_with_the_public)" ] }, { "cell_type": "markdown", "metadata": { "id": "AFh1PbLjtTjD" }, "source": [ "## Making Cloud Storage bucket `ztf_alert_avros` public\n", "\n", "- go to the bucket in the Console\n", "- select \"Uniform\" access control\n", "- \"allUsers\"\n", " - \"Cloud Storage\" -> \"Storage Object Viewer\"\n", " - \"Cloud Storage Legacy\" -> \"Storage Legacy Bucket Reader\"\n", "\n", "Links to more info:\n", "- [Making data public](https://cloud.google.com/storage/docs/access-control/making-data-public)\n", "- [Understanding roles](https://cloud.google.com/iam/docs/understanding-roles)" ] }, { "cell_type": "markdown", "metadata": { "id": "A9jZSylSnhNH" }, "source": [ "## Make bucket `workshop_beam_test`" ] }, { "cell_type": "markdown", "metadata": { "id": "GAa9BY-jowAT" }, "source": [ "On the Console, create a bucket called `ardent-cycling-243415-workshop_beam_test`\n", "\n", "Under permissions:\n", "- \"allUsers\"\n", " - \"Cloud Storage\" -> \"Storage Object Creator\"\n", " - \"Cloud Storage\" -> \"Storage Object Viewer\"\n", " - \"Cloud Storage Legacy\" -> \"Storage Legacy Bucket Reader\"\n", " - \"Cloud Storage Legacy\" -> \"Storage Legacy Bucket Writer\"\n", "\n", "Under Lifecycle:\n", "- set a rule to delete objects when they reach an age of 1 day." ] }, { "cell_type": "code", "metadata": { "id": "Ytsw88Ohnugw" }, "source": [ "from google.cloud import storage\n", "from google.colab import auth" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "GHxn9o-3oHh5" }, "source": [ "auth.authenticate_user()" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "jekPSGDnns_s" }, "source": [ "project_id = 'ardent-cycling-243415'\n", "bucket_name = f'{project_id}-workshop_beam_test'\n", "# bucket = storage.Bucket(bucket_name)\n", "storage_client = storage.Client(project_id)\n", "storage_client.create_bucket(bucket_name)\n" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "LtxuahntoWg9" }, "source": [ "ardent-cycling-243415-workshop_beam_test" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "ys94OS5TXpEd" }, "source": [ "# Test v0.3.1\n", "\n", "- [Broker testing instance instructions](https://github.com/mwvgroup/Pitt-Google-Broker/blob/master/broker/README.md)" ] }, { "cell_type": "markdown", "metadata": { "id": "xZLQLM4Pc6tS" }, "source": [ "## Setup testing instance of broker\n", "\n", "Run from command-line on my local machine:" ] }, { "cell_type": "markdown", "metadata": { "id": "TyyRk_5obq9w" }, "source": [ "Create a testing instance of the broker\n", "\n", "```bash\n", "git clone https://github.com/mwvgroup/Pitt-Google-Broker\n", "cd Pitt-Google-Broker\n", "git checkout v/tjr/0.3.1\n", "cd broker/setup_broker\n", "\n", "testid=\"v031\"\n", "./setup_broker.sh $testid\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "y3IZpp_TcFWG" }, "source": [ "Stop the VMs so we can start the night\n", "\n", "```bash\n", "consumerVM=\"ztf-consumer-${testid}\"\n", "nconductVM=\"night-conductor-${testid}\"\n", "zone=us-central1-a\n", "gcloud compute instances stop \"$consumerVM\" \"$nconductVM\" --zone=\"$zone\"\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "_OuE_75BcZl2" }, "source": [ "Start the broker without the consumer\n", "\n", "```bash\n", "instancename=\"night-conductor-${testid}\"\n", "zone=us-central1-a\n", "NIGHT=\"START\"\n", "KAFKA_TOPIC=\"NONE\" # tell night-conductor to skip booting up consumer VM\n", "gcloud compute instances add-metadata \"$instancename\" --zone=\"$zone\" \\\n", " --metadata NIGHT=\"$NIGHT\",KAFKA_TOPIC=\"$KAFKA_TOPIC\"\n", "# night-conductor will get the testid by parsing its own instance name\n", "\n", "gcloud compute instances start \"$instancename\" --zone \"$zone\"\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "8W_ui02IdyqT" }, "source": [ "## Test the broker using the Consumer Simulator\n", "\n", "(Python)" ] }, { "cell_type": "code", "metadata": { "id": "0BN4YoT9XwTk" }, "source": [ "import sys\n", "path_to_dev_utils = '/Users/troyraen/Documents/PGB/repo2/dev_utils'\n", "sys.path.append(path_to_dev_utils)\n", "\n", "from consumer_sims import ztf_consumer_sim as zcs" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "e34cW8-6kQXk" }, "source": [ "testid = 'v031'\n", "alertRate = (60, 'perMin')\n", "runTime = (1, 'hr')" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "-YNQCgDwa4eF" }, "source": [ "zcs.publish_stream(testid, alertRate, runTime=runTime, sub_id='ztf_alert_data-reservoir')" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "L_OVLPsouDjx" }, "source": [ "" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "cjcWsFEQeMaE" }, "source": [ "## Stop the testing instance of the broker\n", "\n", "- trigger `night-conductor` to end the night" ] }, { "cell_type": "markdown", "metadata": { "id": "kvBXlHrzebeu" }, "source": [ "```bash\n", "testid=v031\n", "instancename=\"night-conductor-${testid}\"\n", "zone=us-central1-a\n", "NIGHT=END\n", "gcloud compute instances add-metadata \"$instancename\" --zone=\"$zone\" \\\n", " --metadata NIGHT=\"$NIGHT\"\n", "gcloud compute instances start \"$instancename\" --zone \"$zone\"\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "xp_7CeVjeuZg" }, "source": [ "## Delete the testing instance of the broker" ] }, { "cell_type": "markdown", "metadata": { "id": "VP23jOVkew9Z" }, "source": [ "```bash\n", "testid=\"v031\"\n", "teardown=\"True\"\n", "cd Pitt-Google-Broker/broker/setup_broker\n", "./setup_broker.sh $testid $teardown\n", "```" ] }, { "cell_type": "markdown", "metadata": { "id": "C4BSEFb2XuN-" }, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": { "id": "-Oi_zEG_g3xT" }, "source": [ "# Consumer simulator" ] }, { "cell_type": "code", "metadata": { "id": "R377iEFsg_F_" }, "source": [ "" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "K4anZ91Ng-9d" }, "source": [ "" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "kesWShdNg8yi" }, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": { "id": "9UOxkwGByHza" }, "source": [ "# Dump here before trashing" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 191 }, "id": "Zp-0WvyTYE38", "outputId": "7dccc80a-f8b4-481c-a9e6-8725f7db80dc" }, "source": [ "columns = pgb.bigquery.get_history_column_names()\n", "objectIds = ['ZTF18aczuwfe']\n", "sql = pgb.bigquery.object_history_sql_statement(columns=columns, objectIds=objectIds)\n", "# using this as a template. remove the GROUPBY and ARRAY_AGG\n", "sql" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" }, "text/plain": [ "'SELECT objectId, ARRAY_AGG(publisher ORDER BY candidate.jd) AS publisher, ARRAY_AGG(candid ORDER BY candidate.jd) AS candid, ARRAY_AGG(schemavsn ORDER BY candidate.jd) AS schemavsn, ARRAY_AGG(candidate.programpi ORDER BY candidate.jd) AS programpi, ARRAY_AGG(candidate.exptime ORDER BY candidate.jd) AS exptime, ARRAY_AGG(candidate.bimage ORDER BY candidate.jd) AS bimage, ARRAY_AGG(candidate.aimage ORDER BY candidate.jd) AS aimage, ARRAY_AGG(candidate.pid ORDER BY candidate.jd) AS pid, ARRAY_AGG(candidate.sigmapsf ORDER BY candidate.jd) AS sigmapsf, ARRAY_AGG(candidate.scorr ORDER BY candidate.jd) AS scorr, ARRAY_AGG(candidate.nmtchps ORDER BY candidate.jd) AS nmtchps, ARRAY_AGG(candidate.rbversion ORDER BY candidate.jd) AS rbversion, ARRAY_AGG(candidate.magdiff ORDER BY candidate.jd) AS magdiff, ARRAY_AGG(candidate.sgmag3 ORDER BY candidate.jd) AS sgmag3, ARRAY_AGG(candidate.diffmaglim ORDER BY candidate.jd) AS diffmaglim, ARRAY_AGG(candidate.sumrat ORDER BY candidate.jd) AS sumrat, ARRAY_AGG(candidate.magpsf ORDER BY candidate.jd) AS magpsf, ARRAY_AGG(candidate.clrmed ORDER BY candidate.jd) AS clrmed, ARRAY_AGG(candidate.bimagerat ORDER BY candidate.jd) AS bimagerat, ARRAY_AGG(candidate.ndethist ORDER BY candidate.jd) AS ndethist, ARRAY_AGG(candidate.sgscore3 ORDER BY candidate.jd) AS sgscore3, ARRAY_AGG(candidate.nframesref ORDER BY candidate.jd) AS nframesref, ARRAY_AGG(candidate.nneg ORDER BY candidate.jd) AS nneg, ARRAY_AGG(candidate.ypos ORDER BY candidate.jd) AS ypos, ARRAY_AGG(candidate.ssmagnr ORDER BY candidate.jd) AS ssmagnr, ARRAY_AGG(candidate.nbad ORDER BY candidate.jd) AS nbad, ARRAY_AGG(candidate.ncovhist ORDER BY candidate.jd) AS ncovhist, ARRAY_AGG(candidate.classtar ORDER BY candidate.jd) AS classtar, ARRAY_AGG(candidate.szmag3 ORDER BY candidate.jd) AS szmag3, ARRAY_AGG(candidate.simag1 ORDER BY candidate.jd) AS simag1, ARRAY_AGG(candidate.magzpsciunc ORDER BY candidate.jd) AS magzpsciunc, ARRAY_AGG(candidate.sigmagap ORDER BY candidate.jd) AS sigmagap, ARRAY_AGG(candidate.nmatches ORDER BY candidate.jd) AS nmatches, ARRAY_AGG(candidate.fid ORDER BY candidate.jd) AS fid, ARRAY_AGG(candidate.field ORDER BY candidate.jd) AS field, ARRAY_AGG(candidate.jdstarthist ORDER BY candidate.jd) AS jdstarthist, ARRAY_AGG(candidate.rfid ORDER BY candidate.jd) AS rfid, ARRAY_AGG(candidate.magzpscirms ORDER BY candidate.jd) AS magzpscirms, ARRAY_AGG(candidate.neargaiabright ORDER BY candidate.jd) AS neargaiabright, ARRAY_AGG(candidate.neargaia ORDER BY candidate.jd) AS neargaia, ARRAY_AGG(candidate.jdstartref ORDER BY candidate.jd) AS jdstartref, ARRAY_AGG(candidate.sgscore2 ORDER BY candidate.jd) AS sgscore2, ARRAY_AGG(candidate.srmag2 ORDER BY candidate.jd) AS srmag2, ARRAY_AGG(candidate.zpmed ORDER BY candidate.jd) AS zpmed, ARRAY_AGG(candidate.maggaiabright ORDER BY candidate.jd) AS maggaiabright, ARRAY_AGG(candidate.srmag3 ORDER BY candidate.jd) AS srmag3, ARRAY_AGG(candidate.magzpsci ORDER BY candidate.jd) AS magzpsci, ARRAY_AGG(candidate.xpos ORDER BY candidate.jd) AS xpos, ARRAY_AGG(candidate.isdiffpos ORDER BY candidate.jd) AS isdiffpos, ARRAY_AGG(candidate.decnr ORDER BY candidate.jd) AS decnr, ARRAY_AGG(candidate.dsdiff ORDER BY candidate.jd) AS dsdiff, ARRAY_AGG(candidate.drb ORDER BY candidate.jd) AS drb, ARRAY_AGG(candidate.jd ORDER BY candidate.jd) AS jd, ARRAY_AGG(candidate.szmag2 ORDER BY candidate.jd) AS szmag2, ARRAY_AGG(candidate.sharpnr ORDER BY candidate.jd) AS sharpnr, ARRAY_AGG(candidate.szmag1 ORDER BY candidate.jd) AS szmag1, ARRAY_AGG(candidate.sky ORDER BY candidate.jd) AS sky, ARRAY_AGG(candidate.simag2 ORDER BY candidate.jd) AS simag2, ARRAY_AGG(candidate.clrrms ORDER BY candidate.jd) AS clrrms, ARRAY_AGG(candidate.objectidps2 ORDER BY candidate.jd) AS objectidps2, ARRAY_AGG(candidate.sgmag1 ORDER BY candidate.jd) AS sgmag1, ARRAY_AGG(candidate.magap ORDER BY candidate.jd) AS magap, ARRAY_AGG(candidate.objectidps1 ORDER BY candidate.jd) AS objectidps1, ARRAY_AGG(candidate.ssdistnr ORDER BY candidate.jd) AS ssdistnr, ARRAY_AGG(candidate.seeratio ORDER BY candidate.jd) AS seeratio, ARRAY_AGG(candidate.distpsnr3 ORDER BY candidate.jd) AS distpsnr3, ARRAY_AGG(candidate.ra ORDER BY candidate.jd) AS ra, ARRAY_AGG(candidate.chipsf ORDER BY candidate.jd) AS chipsf, ARRAY_AGG(candidate.dsnrms ORDER BY candidate.jd) AS dsnrms, ARRAY_AGG(candidate.simag3 ORDER BY candidate.jd) AS simag3, ARRAY_AGG(candidate.zpclrcov ORDER BY candidate.jd) AS zpclrcov, ARRAY_AGG(candidate.fwhm ORDER BY candidate.jd) AS fwhm, ARRAY_AGG(candidate.ranr ORDER BY candidate.jd) AS ranr, ARRAY_AGG(candidate.mindtoedge ORDER BY candidate.jd) AS mindtoedge, ARRAY_AGG(candidate.tblid ORDER BY candidate.jd) AS tblid, ARRAY_AGG(candidate.chinr ORDER BY candidate.jd) AS chinr, ARRAY_AGG(candidate.objectidps3 ORDER BY candidate.jd) AS objectidps3, ARRAY_AGG(candidate.maggaia ORDER BY candidate.jd) AS maggaia, ARRAY_AGG(candidate.distpsnr1 ORDER BY candidate.jd) AS distpsnr1, ARRAY_AGG(candidate.pdiffimfilename ORDER BY candidate.jd) AS pdiffimfilename, ARRAY_AGG(candidate.programid ORDER BY candidate.jd) AS programid, ARRAY_AGG(candidate.rcid ORDER BY candidate.jd) AS rcid, ARRAY_AGG(candidate.clrcounc ORDER BY candidate.jd) AS clrcounc, ARRAY_AGG(candidate.distnr ORDER BY candidate.jd) AS distnr, ARRAY_AGG(candidate.magnr ORDER BY candidate.jd) AS magnr, ARRAY_AGG(candidate.dec ORDER BY candidate.jd) AS dec, ARRAY_AGG(candidate.jdendhist ORDER BY candidate.jd) AS jdendhist, ARRAY_AGG(candidate.sigmagapbig ORDER BY candidate.jd) AS sigmagapbig, ARRAY_AGG(candidate.distpsnr2 ORDER BY candidate.jd) AS distpsnr2, ARRAY_AGG(candidate.ssnamenr ORDER BY candidate.jd) AS ssnamenr, ARRAY_AGG(candidate.magapbig ORDER BY candidate.jd) AS magapbig, ARRAY_AGG(candidate.elong ORDER BY candidate.jd) AS elong, ARRAY_AGG(candidate.jdendref ORDER BY candidate.jd) AS jdendref, ARRAY_AGG(candidate.sgscore1 ORDER BY candidate.jd) AS sgscore1, ARRAY_AGG(candidate.nid ORDER BY candidate.jd) AS nid, ARRAY_AGG(candidate.srmag1 ORDER BY candidate.jd) AS srmag1, ARRAY_AGG(candidate.drbversion ORDER BY candidate.jd) AS drbversion, ARRAY_AGG(candidate.aimagerat ORDER BY candidate.jd) AS aimagerat, ARRAY_AGG(candidate.magfromlim ORDER BY candidate.jd) AS magfromlim, ARRAY_AGG(candidate.clrcoeff ORDER BY candidate.jd) AS clrcoeff, ARRAY_AGG(candidate.ssnrms ORDER BY candidate.jd) AS ssnrms, ARRAY_AGG(candidate.sgmag2 ORDER BY candidate.jd) AS sgmag2, ARRAY_AGG(candidate.sigmagnr ORDER BY candidate.jd) AS sigmagnr, ARRAY_AGG(candidate.rb ORDER BY candidate.jd) AS rb, ARRAY_AGG(candidate.tooflag ORDER BY candidate.jd) AS tooflag FROM `ardent-cycling-243415.ztf_alerts.alerts` WHERE objectId IN (\"ZTF18aczuwfe\") GROUP BY objectId'" ] }, "metadata": { "tags": [] }, "execution_count": 16 } ] }, { "cell_type": "code", "metadata": { "id": "tjWRG-EVYShl" }, "source": [ "sql = 'SELECT objectId, ARRAY_AGG(publisher ORDER BY candidate.jd) AS publisher, ARRAY_AGG(candid ORDER BY candidate.jd) AS candid, ARRAY_AGG(schemavsn ORDER BY candidate.jd) AS schemavsn, ARRAY_AGG(candidate.programpi ORDER BY candidate.jd) AS programpi, ARRAY_AGG(candidate.exptime ORDER BY candidate.jd) AS exptime, ARRAY_AGG(candidate.bimage ORDER BY candidate.jd) AS bimage, ARRAY_AGG(candidate.aimage ORDER BY candidate.jd) AS aimage, ARRAY_AGG(candidate.pid ORDER BY candidate.jd) AS pid, ARRAY_AGG(candidate.sigmapsf ORDER BY candidate.jd) AS sigmapsf, ARRAY_AGG(candidate.scorr ORDER BY candidate.jd) AS scorr, ARRAY_AGG(candidate.nmtchps ORDER BY candidate.jd) AS nmtchps, ARRAY_AGG(candidate.rbversion ORDER BY candidate.jd) AS rbversion, ARRAY_AGG(candidate.magdiff ORDER BY candidate.jd) AS magdiff, ARRAY_AGG(candidate.sgmag3 ORDER BY candidate.jd) AS sgmag3, ARRAY_AGG(candidate.diffmaglim ORDER BY candidate.jd) AS diffmaglim, ARRAY_AGG(candidate.sumrat ORDER BY candidate.jd) AS sumrat, ARRAY_AGG(candidate.magpsf ORDER BY candidate.jd) AS magpsf, ARRAY_AGG(candidate.clrmed ORDER BY candidate.jd) AS clrmed, ARRAY_AGG(candidate.bimagerat ORDER BY candidate.jd) AS bimagerat, ARRAY_AGG(candidate.ndethist ORDER BY candidate.jd) AS ndethist, ARRAY_AGG(candidate.sgscore3 ORDER BY candidate.jd) AS sgscore3, ARRAY_AGG(candidate.nframesref ORDER BY candidate.jd) AS nframesref, ARRAY_AGG(candidate.nneg ORDER BY candidate.jd) AS nneg, ARRAY_AGG(candidate.ypos ORDER BY candidate.jd) AS ypos, ARRAY_AGG(candidate.ssmagnr ORDER BY candidate.jd) AS ssmagnr, ARRAY_AGG(candidate.nbad ORDER BY candidate.jd) AS nbad, ARRAY_AGG(candidate.ncovhist ORDER BY candidate.jd) AS ncovhist, ARRAY_AGG(candidate.classtar ORDER BY candidate.jd) AS classtar, ARRAY_AGG(candidate.szmag3 ORDER BY candidate.jd) AS szmag3, ARRAY_AGG(candidate.simag1 ORDER BY candidate.jd) AS simag1, ARRAY_AGG(candidate.magzpsciunc ORDER BY candidate.jd) AS magzpsciunc, ARRAY_AGG(candidate.sigmagap ORDER BY candidate.jd) AS sigmagap, ARRAY_AGG(candidate.nmatches ORDER BY candidate.jd) AS nmatches, ARRAY_AGG(candidate.fid ORDER BY candidate.jd) AS fid, ARRAY_AGG(candidate.field ORDER BY candidate.jd) AS field, ARRAY_AGG(candidate.jdstarthist ORDER BY candidate.jd) AS jdstarthist, ARRAY_AGG(candidate.rfid ORDER BY candidate.jd) AS rfid, ARRAY_AGG(candidate.magzpscirms ORDER BY candidate.jd) AS magzpscirms, ARRAY_AGG(candidate.neargaiabright ORDER BY candidate.jd) AS neargaiabright, ARRAY_AGG(candidate.neargaia ORDER BY candidate.jd) AS neargaia, ARRAY_AGG(candidate.jdstartref ORDER BY candidate.jd) AS jdstartref, ARRAY_AGG(candidate.sgscore2 ORDER BY candidate.jd) AS sgscore2, ARRAY_AGG(candidate.srmag2 ORDER BY candidate.jd) AS srmag2, ARRAY_AGG(candidate.zpmed ORDER BY candidate.jd) AS zpmed, ARRAY_AGG(candidate.maggaiabright ORDER BY candidate.jd) AS maggaiabright, ARRAY_AGG(candidate.srmag3 ORDER BY candidate.jd) AS srmag3, ARRAY_AGG(candidate.magzpsci ORDER BY candidate.jd) AS magzpsci, ARRAY_AGG(candidate.xpos ORDER BY candidate.jd) AS xpos, ARRAY_AGG(candidate.isdiffpos ORDER BY candidate.jd) AS isdiffpos, ARRAY_AGG(candidate.decnr ORDER BY candidate.jd) AS decnr, ARRAY_AGG(candidate.dsdiff ORDER BY candidate.jd) AS dsdiff, ARRAY_AGG(candidate.drb ORDER BY candidate.jd) AS drb, ARRAY_AGG(candidate.jd ORDER BY candidate.jd) AS jd, ARRAY_AGG(candidate.szmag2 ORDER BY candidate.jd) AS szmag2, ARRAY_AGG(candidate.sharpnr ORDER BY candidate.jd) AS sharpnr, ARRAY_AGG(candidate.szmag1 ORDER BY candidate.jd) AS szmag1, ARRAY_AGG(candidate.sky ORDER BY candidate.jd) AS sky, ARRAY_AGG(candidate.simag2 ORDER BY candidate.jd) AS simag2, ARRAY_AGG(candidate.clrrms ORDER BY candidate.jd) AS clrrms, ARRAY_AGG(candidate.objectidps2 ORDER BY candidate.jd) AS objectidps2, ARRAY_AGG(candidate.sgmag1 ORDER BY candidate.jd) AS sgmag1, ARRAY_AGG(candidate.magap ORDER BY candidate.jd) AS magap, ARRAY_AGG(candidate.objectidps1 ORDER BY candidate.jd) AS objectidps1, ARRAY_AGG(candidate.ssdistnr ORDER BY candidate.jd) AS ssdistnr, ARRAY_AGG(candidate.seeratio ORDER BY candidate.jd) AS seeratio, ARRAY_AGG(candidate.distpsnr3 ORDER BY candidate.jd) AS distpsnr3, ARRAY_AGG(candidate.ra ORDER BY candidate.jd) AS ra, ARRAY_AGG(candidate.chipsf ORDER BY candidate.jd) AS chipsf, ARRAY_AGG(candidate.dsnrms ORDER BY candidate.jd) AS dsnrms, ARRAY_AGG(candidate.simag3 ORDER BY candidate.jd) AS simag3, ARRAY_AGG(candidate.zpclrcov ORDER BY candidate.jd) AS zpclrcov, ARRAY_AGG(candidate.fwhm ORDER BY candidate.jd) AS fwhm, ARRAY_AGG(candidate.ranr ORDER BY candidate.jd) AS ranr, ARRAY_AGG(candidate.mindtoedge ORDER BY candidate.jd) AS mindtoedge, ARRAY_AGG(candidate.tblid ORDER BY candidate.jd) AS tblid, ARRAY_AGG(candidate.chinr ORDER BY candidate.jd) AS chinr, ARRAY_AGG(candidate.objectidps3 ORDER BY candidate.jd) AS objectidps3, ARRAY_AGG(candidate.maggaia ORDER BY candidate.jd) AS maggaia, ARRAY_AGG(candidate.distpsnr1 ORDER BY candidate.jd) AS distpsnr1, ARRAY_AGG(candidate.pdiffimfilename ORDER BY candidate.jd) AS pdiffimfilename, ARRAY_AGG(candidate.programid ORDER BY candidate.jd) AS programid, ARRAY_AGG(candidate.rcid ORDER BY candidate.jd) AS rcid, ARRAY_AGG(candidate.clrcounc ORDER BY candidate.jd) AS clrcounc, ARRAY_AGG(candidate.distnr ORDER BY candidate.jd) AS distnr, ARRAY_AGG(candidate.magnr ORDER BY candidate.jd) AS magnr, ARRAY_AGG(candidate.dec ORDER BY candidate.jd) AS dec, ARRAY_AGG(candidate.jdendhist ORDER BY candidate.jd) AS jdendhist, ARRAY_AGG(candidate.sigmagapbig ORDER BY candidate.jd) AS sigmagapbig, ARRAY_AGG(candidate.distpsnr2 ORDER BY candidate.jd) AS distpsnr2, ARRAY_AGG(candidate.ssnamenr ORDER BY candidate.jd) AS ssnamenr, ARRAY_AGG(candidate.magapbig ORDER BY candidate.jd) AS magapbig, ARRAY_AGG(candidate.elong ORDER BY candidate.jd) AS elong, ARRAY_AGG(candidate.jdendref ORDER BY candidate.jd) AS jdendref, ARRAY_AGG(candidate.sgscore1 ORDER BY candidate.jd) AS sgscore1, ARRAY_AGG(candidate.nid ORDER BY candidate.jd) AS nid, ARRAY_AGG(candidate.srmag1 ORDER BY candidate.jd) AS srmag1, ARRAY_AGG(candidate.drbversion ORDER BY candidate.jd) AS drbversion, ARRAY_AGG(candidate.aimagerat ORDER BY candidate.jd) AS aimagerat, ARRAY_AGG(candidate.magfromlim ORDER BY candidate.jd) AS magfromlim, ARRAY_AGG(candidate.clrcoeff ORDER BY candidate.jd) AS clrcoeff, ARRAY_AGG(candidate.ssnrms ORDER BY candidate.jd) AS ssnrms, ARRAY_AGG(candidate.sgmag2 ORDER BY candidate.jd) AS sgmag2, ARRAY_AGG(candidate.sigmagnr ORDER BY candidate.jd) AS sigmagnr, ARRAY_AGG(candidate.rb ORDER BY candidate.jd) AS rb, ARRAY_AGG(candidate.tooflag ORDER BY candidate.jd) AS tooflag FROM `ardent-cycling-243415.ztf_alerts.alerts` WHERE objectId IN (\"ZTF18aczuwfe\")'" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "Vm0mJjKyRuoh" }, "source": [ "client = bigquery.Client(project=project)\n", "\n", "new_table = f'{project}.{dataset}.history'\n", "job_config = bigquery.QueryJobConfig(destination=new_table)\n", "\n", "\n", "\n", "# Start the query, passing in the extra configuration.\n", "query_job = client.query(sql, job_config=job_config) # Make an API request.\n", "query_job.result() # Wait for the job to complete.\n", "\n", "print(\"Query results loaded to the table {}\".format(table_id))" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "Mbt7hPLNbzXY" }, "source": [ "from pathlib import Path\n", "inpath = Path('ztf_v3.3.pkl')\n", "with inpath.open('rb') as infile:\n", " valid_schema = pickle.load(infile)" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "tOVvW8kYgTkd" }, "source": [ "import json\n", "outpath = Path('history_schema.json')\n", "with outpath.open('w') as outfile:\n", " json.dump(valid_schema['fields'], outfile)" ], "execution_count": null, "outputs": [] } ] }