[
  {
    "path": ".gitignore",
    "content": "target/\ntpcds_kit.zip\ntpch_kit.zip\n*.sql.log\nderby.log\n"
  },
  {
    "path": "README.md",
    "content": "hive-testbench\n==============\n\nA testbench for experimenting with Apache Hive at any data scale.\n\nOverview\n========\n\nThe hive-testbench is a data generator and set of queries that lets you experiment with Apache Hive at scale. The testbench allows you to experience base Hive performance on large datasets, and gives an easy way to see the impact of Hive tuning parameters and advanced settings.\n\nPrerequisites\n=============\n\nYou will need:\n* Hadoop 2.2 or later cluster or Sandbox.\n* Apache Hive.\n* Between 15 minutes and 2 days to generate data (depending on the Scale Factor you choose and available hardware).\n* If you plan to generate 1TB or more of data, using Apache Hive 13+ to generate the data is STRONGLY suggested.\n\nInstall and Setup\n=================\n\nAll of these steps should be carried out on your Hadoop cluster.\n\n- Step 1: Prepare your environment.\n\n  In addition to Hadoop and Hive, before you begin ensure ```gcc``` is installed and available on your system path. If you system does not have it, install it using yum or apt-get.\n\n- Step 2: Decide which test suite(s) you want to use.\n\n  hive-testbench comes with data generators and sample queries based on both the TPC-DS and TPC-H benchmarks. You can choose to use either or both of these benchmarks for experiementation. More information about these benchmarks can be found at the Transaction Processing Council homepage.\n\n- Step 3: Compile and package the appropriate data generator.\n\n  For TPC-DS, ```./tpcds-build.sh``` downloads, compiles and packages the TPC-DS data generator.\n  For TPC-H, ```./tpch-build.sh``` downloads, compiles and packages the TPC-H data generator.\n\n- Step 4: Decide how much data you want to generate.\n\n  You need to decide on a \"Scale Factor\" which represents how much data you will generate. Scale Factor roughly translates to gigabytes, so a Scale Factor of 100 is about 100 gigabytes and one terabyte is Scale Factor 1000. 
Decide how much data you want and keep it in mind for the next step. If you have a cluster of 4-10 nodes or just want to experiment at a smaller scale, Scale 1000 (1 TB) of data is a good starting point. If you have a large cluster, you may want to choose Scale 10000 (10 TB) or more. The notion of scale factor is similar between TPC-DS and TPC-H.\n\n  If you want to generate a large amount of data, you should use Hive 13 or later. Hive 13 introduced an optimization that allows far more scalable data partitioning. Hive 12 and lower will likely crash if you generate more than a few hundred GB of data, and tuning around the problem is difficult. You can generate text or RCFile data in Hive 13 and use it in multiple versions of Hive.\n\n- Step 5: Generate and load the data.\n\n  The scripts ```tpcds-setup.sh``` and ```tpch-setup.sh``` generate and load data for TPC-DS and TPC-H, respectively. General usage is ```tpcds-setup.sh scale_factor [directory]``` or ```tpch-setup.sh scale_factor [directory]```\n\n  Some examples:\n\n  Build 1 TB of TPC-DS data: ```./tpcds-setup.sh 1000```\n\n  Build 1 TB of TPC-H data: ```./tpch-setup.sh 1000```\n\n  Build 100 TB of TPC-DS data: ```./tpcds-setup.sh 100000```\n\n  Build 30 TB of text formatted TPC-DS data: ```FORMAT=textfile ./tpcds-setup.sh 30000```\n\n  Build 30 TB of RCFile formatted TPC-DS data: ```FORMAT=rcfile ./tpcds-setup.sh 30000```\n  \n  Also review the other parameters in the setup scripts; an important one is BUCKET_DATA.\n\n- Step 6: Run queries.\n\n  More than 50 sample TPC-DS queries and all TPC-H queries are included for you to try. You can use ```hive```, ```beeline``` or the SQL tool of your choice. 
The testbench also includes a set of suggested settings.\n\n  This example assumes you have generated 1 TB of TPC-DS data during Step 5:\n\n  \t```\n  \tcd sample-queries-tpcds\n  \thive -i testbench.settings\n  \thive> use tpcds_bin_partitioned_orc_1000;\n  \thive> source query55.sql;\n  \t```\n\n  Note that the database is named based on the Data Scale chosen in Step 4. For TPC-DS at Data Scale 10000, your database will be named tpcds_bin_partitioned_orc_10000. For TPC-H at Data Scale 1000, it will be named tpch_flat_orc_1000. You can always ```show databases``` to get a list of available databases.\n\n  Similarly, if you generated 1 TB of TPC-H data during Step 5:\n\n  \t```\n  \tcd sample-queries-tpch\n  \thive -i testbench.settings\n  \thive> use tpch_flat_orc_1000;\n  \thive> source tpch_query1.sql;\n  \t```\n\nFeedback\n========\n\nIf you have questions, comments or problems, visit the [Hortonworks Hive forum](http://hortonworks.com/community/forums/forum/hive/).\n\nIf you have improvements, pull requests are accepted.\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/add_constraints.sql",
    "content": "-- set hivevar:DB=tpcds_bin_partitioned_orc_10000\n\nalter table customer_address add constraint ${DB}_pk_ca primary key (ca_address_sk) disable novalidate rely;\nalter table customer_demographics add constraint ${DB}_pk_cd primary key (cd_demo_sk) disable novalidate rely;\nalter table date_dim add constraint ${DB}_pk_dd primary key (d_date_sk) disable novalidate rely;\nalter table warehouse add constraint ${DB}_pk_w primary key (w_warehouse_sk) disable novalidate rely;\nalter table ship_mode add constraint ${DB}_pk_sm primary key (sm_ship_mode_sk) disable novalidate rely;\nalter table time_dim add constraint ${DB}_pk_td primary key (t_time_sk) disable novalidate rely;\nalter table reason add constraint ${DB}_pk_r primary key (r_reason_sk) disable novalidate rely;\nalter table income_band add constraint ${DB}_pk_ib primary key (ib_income_band_sk) disable novalidate rely;\nalter table item add constraint ${DB}_pk_i primary key (i_item_sk) disable novalidate rely;\nalter table store add constraint ${DB}_pk_s primary key (s_store_sk) disable novalidate rely;\nalter table call_center add constraint ${DB}_pk_cc primary key (cc_call_center_sk) disable novalidate rely;\nalter table customer add constraint ${DB}_pk_c primary key (c_customer_sk) disable novalidate rely;\nalter table web_site add constraint ${DB}_pk_ws primary key (web_site_sk) disable novalidate rely;\nalter table store_returns add constraint ${DB}_pk_sr primary key (sr_item_sk, sr_ticket_number) disable novalidate rely;\nalter table household_demographics add constraint ${DB}_pk_hd primary key (hd_demo_sk) disable novalidate rely;\nalter table web_page add constraint ${DB}_pk_wp primary key (wp_web_page_sk) disable novalidate rely;\nalter table promotion add constraint ${DB}_pk_p primary key (p_promo_sk) disable novalidate rely;\nalter table catalog_page add constraint ${DB}_pk_cp primary key (cp_catalog_page_sk) disable novalidate rely;\n-- partition_col case\nalter table inventory add 
constraint ${DB}_pk_in primary key (inv_date_sk, inv_item_sk, inv_warehouse_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_pk_cr primary key (cr_item_sk, cr_order_number) disable novalidate rely;\nalter table web_returns add constraint ${DB}_pk_wr primary key (wr_item_sk, wr_order_number) disable novalidate rely;\nalter table web_sales add constraint ${DB}_pk_ws2 primary key (ws_item_sk, ws_order_number) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_pk_cs primary key (cs_item_sk, cs_order_number) disable novalidate rely;\nalter table store_sales add constraint ${DB}_pk_ss primary key (ss_item_sk, ss_ticket_number) disable novalidate rely;\n\nalter table call_center add constraint ${DB}_cc_d1 foreign key  (cc_closed_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table call_center add constraint ${DB}_cc_d2 foreign key  (cc_open_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table catalog_page add constraint ${DB}_cp_d1 foreign key  (cp_end_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table catalog_page add constraint ${DB}_cp_d2 foreign key  (cp_start_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_cc foreign key  (cr_call_center_sk) references call_center (cc_call_center_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_cp foreign key  (cr_catalog_page_sk) references catalog_page (cp_catalog_page_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_cs foreign key  (cr_item_sk, cr_order_number) references catalog_sales (cs_item_sk, cs_order_number) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_i foreign key  (cr_item_sk) references item (i_item_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_r foreign key  (cr_reason_sk) references 
reason (r_reason_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_a1 foreign key  (cr_refunded_addr_sk) references customer_address (ca_address_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_cd1 foreign key  (cr_refunded_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_c1 foreign key  (cr_refunded_customer_sk) references customer (c_customer_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_hd1 foreign key  (cr_refunded_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\n-- partition_col case\nalter table catalog_returns add constraint ${DB}_cr_d1 foreign key  (cr_returned_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_t foreign key  (cr_returned_time_sk) references time_dim (t_time_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_a2 foreign key  (cr_returning_addr_sk) references customer_address (ca_address_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_cd2 foreign key  (cr_returning_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_c2 foreign key  (cr_returning_customer_sk) references customer (c_customer_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_hd2 foreign key  (cr_returning_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\n-- alter table catalog_returns add constraint ${DB}_cr_d2 foreign key  (cr_ship_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table catalog_returns add constraint ${DB}_cr_sm foreign key  (cr_ship_mode_sk) references ship_mode (sm_ship_mode_sk) disable novalidate rely;\nalter table catalog_returns add constraint 
${DB}_cr_w2 foreign key  (cr_warehouse_sk) references warehouse (w_warehouse_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_b_a foreign key  (cs_bill_addr_sk) references customer_address (ca_address_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_b_cd foreign key  (cs_bill_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_b_c foreign key  (cs_bill_customer_sk) references customer (c_customer_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_b_hd foreign key  (cs_bill_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_cc foreign key  (cs_call_center_sk) references call_center (cc_call_center_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_cp foreign key  (cs_catalog_page_sk) references catalog_page (cp_catalog_page_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_i foreign key  (cs_item_sk) references item (i_item_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_p foreign key  (cs_promo_sk) references promotion (p_promo_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_s_a foreign key  (cs_ship_addr_sk) references customer_address (ca_address_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_s_cd foreign key  (cs_ship_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_s_c foreign key  (cs_ship_customer_sk) references customer (c_customer_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_d1 foreign key  (cs_ship_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_s_hd foreign key  
(cs_ship_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_sm foreign key  (cs_ship_mode_sk) references ship_mode (sm_ship_mode_sk) disable novalidate rely;\n-- partition_col case\nalter table catalog_sales add constraint ${DB}_cs_d2 foreign key  (cs_sold_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_t foreign key  (cs_sold_time_sk) references time_dim (t_time_sk) disable novalidate rely;\nalter table catalog_sales add constraint ${DB}_cs_w foreign key  (cs_warehouse_sk) references warehouse (w_warehouse_sk) disable novalidate rely;\nalter table customer add constraint ${DB}_c_a foreign key  (c_current_addr_sk) references customer_address (ca_address_sk) disable novalidate rely;\nalter table customer add constraint ${DB}_c_cd foreign key  (c_current_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table customer add constraint ${DB}_c_hd foreign key  (c_current_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\nalter table customer add constraint ${DB}_c_fsd foreign key  (c_first_sales_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table customer add constraint ${DB}_c_fsd2 foreign key  (c_first_shipto_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table household_demographics add constraint ${DB}_hd_ib foreign key  (hd_income_band_sk) references income_band (ib_income_band_sk) disable novalidate rely;\n-- partition_col case\nalter table inventory add constraint ${DB}_inv_d foreign key  (inv_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table inventory add constraint ${DB}_inv_i foreign key  (inv_item_sk) references item (i_item_sk) disable novalidate rely;\nalter table inventory add constraint ${DB}_inv_w foreign key  (inv_warehouse_sk) references warehouse 
(w_warehouse_sk) disable novalidate rely;\nalter table promotion add constraint ${DB}_p_end_date foreign key  (p_end_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table promotion add constraint ${DB}_p_i foreign key  (p_item_sk) references item (i_item_sk) disable novalidate rely;\nalter table promotion add constraint ${DB}_p_start_date foreign key  (p_start_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table store add constraint ${DB}_s_close_date foreign key  (s_closed_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table store_returns add constraint ${DB}_sr_a foreign key  (sr_addr_sk) references customer_address (ca_address_sk) disable novalidate rely;\nalter table store_returns add constraint ${DB}_sr_cd foreign key  (sr_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table store_returns add constraint ${DB}_sr_c foreign key  (sr_customer_sk) references customer (c_customer_sk) disable novalidate rely;\nalter table store_returns add constraint ${DB}_sr_hd foreign key  (sr_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\nalter table store_returns add constraint ${DB}_sr_i foreign key  (sr_item_sk) references item (i_item_sk) disable novalidate rely;\nalter table store_returns add constraint ${DB}_sr_r foreign key  (sr_reason_sk) references reason (r_reason_sk) disable novalidate rely;\n-- partition_col case\nalter table store_returns add constraint ${DB}_sr_ret_d foreign key  (sr_returned_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table store_returns add constraint ${DB}_sr_t foreign key  (sr_return_time_sk) references time_dim (t_time_sk) disable novalidate rely;\nalter table store_returns add constraint ${DB}_sr_s foreign key  (sr_store_sk) references store (s_store_sk) disable novalidate rely;\nalter table store_returns add constraint ${DB}_sr_ss foreign key  (sr_item_sk, 
sr_ticket_number) references store_sales (ss_item_sk, ss_ticket_number) disable novalidate rely;\nalter table store_sales add constraint ${DB}_ss_a foreign key  (ss_addr_sk) references customer_address (ca_address_sk) disable novalidate rely;\nalter table store_sales add constraint ${DB}_ss_cd foreign key  (ss_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table store_sales add constraint ${DB}_ss_c foreign key  (ss_customer_sk) references customer (c_customer_sk) disable novalidate rely;\nalter table store_sales add constraint ${DB}_ss_hd foreign key  (ss_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\nalter table store_sales add constraint ${DB}_ss_i foreign key  (ss_item_sk) references item (i_item_sk) disable novalidate rely;\nalter table store_sales add constraint ${DB}_ss_p foreign key  (ss_promo_sk) references promotion (p_promo_sk) disable novalidate rely;\n-- partition_col case\nalter table store_sales add constraint ${DB}_ss_d foreign key  (ss_sold_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table store_sales add constraint ${DB}_ss_t foreign key  (ss_sold_time_sk) references time_dim (t_time_sk) disable novalidate rely;\nalter table store_sales add constraint ${DB}_ss_s foreign key  (ss_store_sk) references store (s_store_sk) disable novalidate rely;\nalter table web_page add constraint ${DB}_wp_ad foreign key  (wp_access_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table web_page add constraint ${DB}_wp_cd foreign key  (wp_creation_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_i foreign key  (wr_item_sk) references item (i_item_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_r foreign key  (wr_reason_sk) references reason (r_reason_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_ref_a foreign key 
 (wr_refunded_addr_sk) references customer_address (ca_address_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_ref_cd foreign key  (wr_refunded_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_ref_c foreign key  (wr_refunded_customer_sk) references customer (c_customer_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_ref_hd foreign key  (wr_refunded_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\n-- partition_col case\nalter table web_returns add constraint ${DB}_wr_ret_d foreign key  (wr_returned_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_ret_t foreign key  (wr_returned_time_sk) references time_dim (t_time_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_ret_a foreign key  (wr_returning_addr_sk) references customer_address (ca_address_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_ret_cd foreign key  (wr_returning_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_ret_c foreign key  (wr_returning_customer_sk) references customer (c_customer_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_ret_hd foreign key  (wr_returning_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_ws foreign key  (wr_item_sk, wr_order_number) references web_sales (ws_item_sk, ws_order_number) disable novalidate rely;\nalter table web_returns add constraint ${DB}_wr_wp foreign key  (wr_web_page_sk) references web_page (wp_web_page_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_b_a foreign key  (ws_bill_addr_sk) references customer_address (ca_address_sk) disable novalidate 
rely;\nalter table web_sales add constraint ${DB}_ws_b_cd foreign key  (ws_bill_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_b_c foreign key  (ws_bill_customer_sk) references customer (c_customer_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_b_hd foreign key  (ws_bill_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_i foreign key  (ws_item_sk) references item (i_item_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_p foreign key  (ws_promo_sk) references promotion (p_promo_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_s_a foreign key  (ws_ship_addr_sk) references customer_address (ca_address_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_s_cd foreign key  (ws_ship_cdemo_sk) references customer_demographics (cd_demo_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_s_c foreign key  (ws_ship_customer_sk) references customer (c_customer_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_s_d foreign key  (ws_ship_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_s_hd foreign key  (ws_ship_hdemo_sk) references household_demographics (hd_demo_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_sm foreign key  (ws_ship_mode_sk) references ship_mode (sm_ship_mode_sk) disable novalidate rely;\n-- partition_col case\nalter table web_sales add constraint ${DB}_ws_d2 foreign key  (ws_sold_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_t foreign key  (ws_sold_time_sk) references time_dim (t_time_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_w2 foreign key  (ws_warehouse_sk) 
references warehouse (w_warehouse_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_wp foreign key  (ws_web_page_sk) references web_page (wp_web_page_sk) disable novalidate rely;\nalter table web_sales add constraint ${DB}_ws_ws foreign key  (ws_web_site_sk) references web_site (web_site_sk) disable novalidate rely;\nalter table web_site add constraint ${DB}_web_d1 foreign key  (web_close_date_sk) references date_dim (d_date_sk) disable novalidate rely;\nalter table web_site add constraint ${DB}_web_d2 foreign key (web_open_date_sk) references date_dim (d_date_sk) disable novalidate rely;\n\nalter table store change column s_store_id s_store_id string constraint ${DB}_strid_nn not null disable novalidate rely;\nalter table call_center change column cc_call_center_id cc_call_center_id string constraint ${DB}_ccid_nn not null disable novalidate rely;\nalter table catalog_page change column cp_catalog_page_id cp_catalog_page_id string constraint ${DB}_cpid_nn not null disable novalidate rely;\nalter table web_site change column web_site_id web_site_id string constraint ${DB}_wsid_nn not null disable novalidate rely;\nalter table web_page change column wp_web_page_id wp_web_page_id string constraint ${DB}_wpid_nn not null disable novalidate rely;\nalter table warehouse change column w_warehouse_id w_warehouse_id string constraint ${DB}_wid_nn not null disable novalidate rely;\nalter table customer change column c_customer_id c_customer_id string constraint ${DB}_cid_nn not null disable novalidate rely;\nalter table customer_address change column ca_address_id ca_address_id string constraint ${DB}_caid_nn not null disable novalidate rely;\nalter table date_dim change column d_date_id d_date_id string constraint ${DB}_did_nn not null disable novalidate rely;\nalter table item change column i_item_id i_item_id string constraint ${DB}_itid_nn not null disable novalidate rely;\nalter table promotion change column p_promo_id p_promo_id string 
constraint ${DB}_pid_nn not null disable novalidate rely;\nalter table reason change column r_reason_id r_reason_id string constraint ${DB}_rid_nn not null disable novalidate rely;\nalter table ship_mode change column sm_ship_mode_id sm_ship_mode_id string constraint ${DB}_smid_nn not null disable novalidate rely;\nalter table time_dim change column t_time_id t_time_id string constraint ${DB}_tid_nn not null disable novalidate rely;\n\nalter table customer change column c_customer_id c_customer_id string constraint ${DB}_cid_uq unique disable novalidate rely;\n\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/analyze.sql",
    "content": "analyze table call_center compute statistics for columns;\nanalyze table catalog_page compute statistics for columns;\nanalyze table catalog_returns compute statistics for columns;\nanalyze table catalog_sales compute statistics for columns;\nanalyze table customer compute statistics for columns;\nanalyze table customer_address compute statistics for columns;\nanalyze table customer_demographics compute statistics for columns;\nanalyze table date_dim compute statistics for columns;\nanalyze table household_demographics compute statistics for columns;\nanalyze table income_band compute statistics for columns;\nanalyze table inventory compute statistics for columns;\nanalyze table item compute statistics for columns;\nanalyze table promotion compute statistics for columns;\nanalyze table reason compute statistics for columns;\nanalyze table ship_mode compute statistics for columns;\nanalyze table store compute statistics for columns;\nanalyze table store_returns compute statistics for columns;\nanalyze table store_sales compute statistics for columns;\nanalyze table time_dim compute statistics for columns;\nanalyze table warehouse compute statistics for columns;\nanalyze table web_page compute statistics for columns;\nanalyze table web_returns compute statistics for columns;\nanalyze table web_sales compute statistics for columns;\nanalyze table web_site compute statistics for columns;"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/call_center.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists call_center;\n\ncreate table call_center\nstored as ${FILE}\nas select * from ${SOURCE}.call_center;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/catalog_page.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists catalog_page;\n\ncreate table catalog_page\nstored as ${FILE}\nas select * from ${SOURCE}.catalog_page;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/catalog_returns.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists catalog_returns;\n\ncreate table catalog_returns\n(\n      cr_returned_time_sk bigint\n,     cr_item_sk bigint\n,     cr_refunded_customer_sk bigint\n,     cr_refunded_cdemo_sk bigint\n,     cr_refunded_hdemo_sk bigint\n,     cr_refunded_addr_sk bigint\n,     cr_returning_customer_sk bigint\n,     cr_returning_cdemo_sk bigint\n,     cr_returning_hdemo_sk bigint\n,     cr_returning_addr_sk bigint\n,     cr_call_center_sk bigint\n,     cr_catalog_page_sk bigint\n,     cr_ship_mode_sk bigint\n,     cr_warehouse_sk bigint\n,     cr_reason_sk bigint\n,     cr_order_number bigint\n,     cr_return_quantity int\n,     cr_return_amount decimal(7,2)\n,     cr_return_tax decimal(7,2)\n,     cr_return_amt_inc_tax decimal(7,2)\n,     cr_fee decimal(7,2)\n,     cr_return_ship_cost decimal(7,2)\n,     cr_refunded_cash decimal(7,2)\n,     cr_reversed_charge decimal(7,2)\n,     cr_store_credit decimal(7,2)\n,     cr_net_loss decimal(7,2)\n)\npartitioned by (cr_returned_date_sk bigint)\nstored as ${FILE};\n\nfrom ${SOURCE}.catalog_returns cr\ninsert overwrite table catalog_returns partition(cr_returned_date_sk) \nselect\n        cr.cr_returned_time_sk,\n        cr.cr_item_sk,\n        cr.cr_refunded_customer_sk,\n        cr.cr_refunded_cdemo_sk,\n        cr.cr_refunded_hdemo_sk,\n        cr.cr_refunded_addr_sk,\n        cr.cr_returning_customer_sk,\n        cr.cr_returning_cdemo_sk,\n        cr.cr_returning_hdemo_sk,\n        cr.cr_returning_addr_sk,\n        cr.cr_call_center_sk,\n        cr.cr_catalog_page_sk,\n        cr.cr_ship_mode_sk,\n        cr.cr_warehouse_sk,\n        cr.cr_reason_sk,\n        cr.cr_order_number,\n        cr.cr_return_quantity,\n        cr.cr_return_amount,\n        cr.cr_return_tax,\n        cr.cr_return_amt_inc_tax,\n        cr.cr_fee,\n        cr.cr_return_ship_cost,\n        cr.cr_refunded_cash,\n        cr.cr_reversed_charge,\n        cr.cr_store_credit,\n        
cr.cr_net_loss,\n        cr.cr_returned_date_sk\n      where cr.cr_returned_date_sk is not null\ninsert overwrite table catalog_returns partition (cr_returned_date_sk) \nselect\n        cr.cr_returned_time_sk,\n        cr.cr_item_sk,\n        cr.cr_refunded_customer_sk,\n        cr.cr_refunded_cdemo_sk,\n        cr.cr_refunded_hdemo_sk,\n        cr.cr_refunded_addr_sk,\n        cr.cr_returning_customer_sk,\n        cr.cr_returning_cdemo_sk,\n        cr.cr_returning_hdemo_sk,\n        cr.cr_returning_addr_sk,\n        cr.cr_call_center_sk,\n        cr.cr_catalog_page_sk,\n        cr.cr_ship_mode_sk,\n        cr.cr_warehouse_sk,\n        cr.cr_reason_sk,\n        cr.cr_order_number,\n        cr.cr_return_quantity,\n        cr.cr_return_amount,\n        cr.cr_return_tax,\n        cr.cr_return_amt_inc_tax,\n        cr.cr_fee,\n        cr.cr_return_ship_cost,\n        cr.cr_refunded_cash,\n        cr.cr_reversed_charge,\n        cr.cr_store_credit,\n        cr.cr_net_loss,\n        cr.cr_returned_date_sk\n      where cr.cr_returned_date_sk is null\n      sort by cr_returned_date_sk\n;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/catalog_sales.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists catalog_sales;\n\ncreate table catalog_sales\n(\n      cs_sold_time_sk bigint\n,     cs_ship_date_sk bigint\n,     cs_bill_customer_sk bigint\n,     cs_bill_cdemo_sk bigint\n,     cs_bill_hdemo_sk bigint\n,     cs_bill_addr_sk bigint\n,     cs_ship_customer_sk bigint\n,     cs_ship_cdemo_sk bigint\n,     cs_ship_hdemo_sk bigint\n,     cs_ship_addr_sk bigint\n,     cs_call_center_sk bigint\n,     cs_catalog_page_sk bigint\n,     cs_ship_mode_sk bigint\n,     cs_warehouse_sk bigint\n,     cs_item_sk bigint\n,     cs_promo_sk bigint\n,     cs_order_number bigint\n,     cs_quantity int\n,     cs_wholesale_cost decimal(7,2)\n,     cs_list_price decimal(7,2)\n,     cs_sales_price decimal(7,2)\n,     cs_ext_discount_amt decimal(7,2)\n,     cs_ext_sales_price decimal(7,2)\n,     cs_ext_wholesale_cost decimal(7,2)\n,     cs_ext_list_price decimal(7,2)\n,     cs_ext_tax decimal(7,2)\n,     cs_coupon_amt decimal(7,2)\n,     cs_ext_ship_cost decimal(7,2)\n,     cs_net_paid decimal(7,2)\n,     cs_net_paid_inc_tax decimal(7,2)\n,     cs_net_paid_inc_ship decimal(7,2)\n,     cs_net_paid_inc_ship_tax decimal(7,2)\n,     cs_net_profit decimal(7,2)\n)\npartitioned by (cs_sold_date_sk bigint)\nstored as ${FILE};\n\nfrom ${SOURCE}.catalog_sales cs\ninsert overwrite table catalog_sales partition (cs_sold_date_sk) \nselect\n        cs.cs_sold_time_sk,\n        cs.cs_ship_date_sk,\n        cs.cs_bill_customer_sk,\n        cs.cs_bill_cdemo_sk,\n        cs.cs_bill_hdemo_sk,\n        cs.cs_bill_addr_sk,\n        cs.cs_ship_customer_sk,\n        cs.cs_ship_cdemo_sk,\n        cs.cs_ship_hdemo_sk,\n        cs.cs_ship_addr_sk,\n        cs.cs_call_center_sk,\n        cs.cs_catalog_page_sk,\n        cs.cs_ship_mode_sk,\n        cs.cs_warehouse_sk,\n        cs.cs_item_sk,\n        cs.cs_promo_sk,\n        cs.cs_order_number,\n        cs.cs_quantity,\n        cs.cs_wholesale_cost,\n        cs.cs_list_price,\n   
     cs.cs_sales_price,\n        cs.cs_ext_discount_amt,\n        cs.cs_ext_sales_price,\n        cs.cs_ext_wholesale_cost,\n        cs.cs_ext_list_price,\n        cs.cs_ext_tax,\n        cs.cs_coupon_amt,\n        cs.cs_ext_ship_cost,\n        cs.cs_net_paid,\n        cs.cs_net_paid_inc_tax,\n        cs.cs_net_paid_inc_ship,\n        cs.cs_net_paid_inc_ship_tax,\n        cs.cs_net_profit,\n        cs.cs_sold_date_sk\n        where cs.cs_sold_date_sk is not null\ninsert overwrite table catalog_sales partition (cs_sold_date_sk) \nselect\n        cs.cs_sold_time_sk,\n        cs.cs_ship_date_sk,\n        cs.cs_bill_customer_sk,\n        cs.cs_bill_cdemo_sk,\n        cs.cs_bill_hdemo_sk,\n        cs.cs_bill_addr_sk,\n        cs.cs_ship_customer_sk,\n        cs.cs_ship_cdemo_sk,\n        cs.cs_ship_hdemo_sk,\n        cs.cs_ship_addr_sk,\n        cs.cs_call_center_sk,\n        cs.cs_catalog_page_sk,\n        cs.cs_ship_mode_sk,\n        cs.cs_warehouse_sk,\n        cs.cs_item_sk,\n        cs.cs_promo_sk,\n        cs.cs_order_number,\n        cs.cs_quantity,\n        cs.cs_wholesale_cost,\n        cs.cs_list_price,\n        cs.cs_sales_price,\n        cs.cs_ext_discount_amt,\n        cs.cs_ext_sales_price,\n        cs.cs_ext_wholesale_cost,\n        cs.cs_ext_list_price,\n        cs.cs_ext_tax,\n        cs.cs_coupon_amt,\n        cs.cs_ext_ship_cost,\n        cs.cs_net_paid,\n        cs.cs_net_paid_inc_tax,\n        cs.cs_net_paid_inc_ship,\n        cs.cs_net_paid_inc_ship_tax,\n        cs.cs_net_profit,\n        cs.cs_sold_date_sk\n        where cs.cs_sold_date_sk is null\n        sort by cs.cs_sold_date_sk\n ;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/customer.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists customer;\n\ncreate table customer\nstored as ${FILE}\nas select * from ${SOURCE}.customer\nCLUSTER BY c_customer_sk\n;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/customer_address.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists customer_address;\n\ncreate table customer_address\nstored as ${FILE}\nas select * from ${SOURCE}.customer_address \nCLUSTER BY ca_address_sk\n;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/customer_demographics.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists customer_demographics;\n\ncreate table customer_demographics\nstored as ${FILE}\nas select * from ${SOURCE}.customer_demographics;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/date_dim.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists date_dim;\n\ncreate table date_dim\nstored as ${FILE}\nas select * from ${SOURCE}.date_dim;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/household_demographics.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists household_demographics;\n\ncreate table household_demographics\nstored as ${FILE}\nas select * from ${SOURCE}.household_demographics;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/income_band.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists income_band;\n\ncreate table income_band\nstored as ${FILE}\nas select * from ${SOURCE}.income_band;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/inventory.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists inventory;\n\ncreate table inventory\nstored as ${FILE}\nas select * from ${SOURCE}.inventory\nCLUSTER BY inv_date_sk\n;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/item.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists item;\n\ncreate table item\nstored as ${FILE}\nas select * from ${SOURCE}.item\nCLUSTER BY i_item_sk\n;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/promotion.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists promotion;\n\ncreate table promotion\nstored as ${FILE}\nas select * from ${SOURCE}.promotion;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/reason.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists reason;\n\ncreate table reason\nstored as ${FILE}\nas select * from ${SOURCE}.reason;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/ship_mode.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists ship_mode;\n\ncreate table ship_mode\nstored as ${FILE}\nas select * from ${SOURCE}.ship_mode;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/store.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists store;\n\ncreate table store\nstored as ${FILE}\nas select * from ${SOURCE}.store\nCLUSTER BY s_store_sk\n;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/store_returns.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists store_returns;\n\ncreate table store_returns\n(\n      sr_return_time_sk bigint\n,     sr_item_sk bigint\n,     sr_customer_sk bigint\n,     sr_cdemo_sk bigint\n,     sr_hdemo_sk bigint\n,     sr_addr_sk bigint\n,     sr_store_sk bigint\n,     sr_reason_sk bigint\n,     sr_ticket_number bigint\n,     sr_return_quantity int\n,     sr_return_amt decimal(7,2)\n,     sr_return_tax decimal(7,2)\n,     sr_return_amt_inc_tax decimal(7,2)\n,     sr_fee decimal(7,2)\n,     sr_return_ship_cost decimal(7,2)\n,     sr_refunded_cash decimal(7,2)\n,     sr_reversed_charge decimal(7,2)\n,     sr_store_credit decimal(7,2)\n,     sr_net_loss decimal(7,2)\n)\npartitioned by (sr_returned_date_sk bigint)\nstored as ${FILE};\n\nfrom ${SOURCE}.store_returns sr\ninsert overwrite table store_returns partition (sr_returned_date_sk) \nselect\n        sr.sr_return_time_sk,\n        sr.sr_item_sk,\n        sr.sr_customer_sk,\n        sr.sr_cdemo_sk,\n        sr.sr_hdemo_sk,\n        sr.sr_addr_sk,\n        sr.sr_store_sk,\n        sr.sr_reason_sk,\n        sr.sr_ticket_number,\n        sr.sr_return_quantity,\n        sr.sr_return_amt,\n        sr.sr_return_tax,\n        sr.sr_return_amt_inc_tax,\n        sr.sr_fee,\n        sr.sr_return_ship_cost,\n        sr.sr_refunded_cash,\n        sr.sr_reversed_charge,\n        sr.sr_store_credit,\n        sr.sr_net_loss,\n        sr.sr_returned_date_sk\n        where sr.sr_returned_date_sk is not null\ninsert overwrite table store_returns partition (sr_returned_date_sk) \nselect\n        sr.sr_return_time_sk,\n        sr.sr_item_sk,\n        sr.sr_customer_sk,\n        sr.sr_cdemo_sk,\n        sr.sr_hdemo_sk,\n        sr.sr_addr_sk,\n        sr.sr_store_sk,\n        sr.sr_reason_sk,\n        sr.sr_ticket_number,\n        sr.sr_return_quantity,\n        sr.sr_return_amt,\n        sr.sr_return_tax,\n        sr.sr_return_amt_inc_tax,\n        sr.sr_fee,\n        
sr.sr_return_ship_cost,\n        sr.sr_refunded_cash,\n        sr.sr_reversed_charge,\n        sr.sr_store_credit,\n        sr.sr_net_loss,\n        sr.sr_returned_date_sk\n        where sr.sr_returned_date_sk is null\n        sort by sr.sr_returned_date_sk\n;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/store_sales.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists store_sales;\n\ncreate table store_sales\n(\n      ss_sold_time_sk bigint\n,     ss_item_sk bigint\n,     ss_customer_sk bigint\n,     ss_cdemo_sk bigint\n,     ss_hdemo_sk bigint\n,     ss_addr_sk bigint\n,     ss_store_sk bigint\n,     ss_promo_sk bigint\n,     ss_ticket_number bigint\n,     ss_quantity int\n,     ss_wholesale_cost decimal(7,2)\n,     ss_list_price decimal(7,2)\n,     ss_sales_price decimal(7,2)\n,     ss_ext_discount_amt decimal(7,2)\n,     ss_ext_sales_price decimal(7,2)\n,     ss_ext_wholesale_cost decimal(7,2)\n,     ss_ext_list_price decimal(7,2)\n,     ss_ext_tax decimal(7,2)\n,     ss_coupon_amt decimal(7,2)\n,     ss_net_paid decimal(7,2)\n,     ss_net_paid_inc_tax decimal(7,2)\n,     ss_net_profit decimal(7,2)\n)\npartitioned by (ss_sold_date_sk bigint)\nstored as ${FILE};\n\nfrom ${SOURCE}.store_sales ss\ninsert overwrite table store_sales partition (ss_sold_date_sk) \nselect\n        ss.ss_sold_time_sk,\n        ss.ss_item_sk,\n        ss.ss_customer_sk,\n        ss.ss_cdemo_sk,\n        ss.ss_hdemo_sk,\n        ss.ss_addr_sk,\n        ss.ss_store_sk,\n        ss.ss_promo_sk,\n        ss.ss_ticket_number,\n        ss.ss_quantity,\n        ss.ss_wholesale_cost,\n        ss.ss_list_price,\n        ss.ss_sales_price,\n        ss.ss_ext_discount_amt,\n        ss.ss_ext_sales_price,\n        ss.ss_ext_wholesale_cost,\n        ss.ss_ext_list_price,\n        ss.ss_ext_tax,\n        ss.ss_coupon_amt,\n        ss.ss_net_paid,\n        ss.ss_net_paid_inc_tax,\n        ss.ss_net_profit,\n        ss.ss_sold_date_sk\n        where ss.ss_sold_date_sk is not null\ninsert overwrite table store_sales partition (ss_sold_date_sk) \nselect\n        ss.ss_sold_time_sk,\n        ss.ss_item_sk,\n        ss.ss_customer_sk,\n        ss.ss_cdemo_sk,\n        ss.ss_hdemo_sk,\n        ss.ss_addr_sk,\n        ss.ss_store_sk,\n        ss.ss_promo_sk,\n        
ss.ss_ticket_number,\n        ss.ss_quantity,\n        ss.ss_wholesale_cost,\n        ss.ss_list_price,\n        ss.ss_sales_price,\n        ss.ss_ext_discount_amt,\n        ss.ss_ext_sales_price,\n        ss.ss_ext_wholesale_cost,\n        ss.ss_ext_list_price,\n        ss.ss_ext_tax,\n        ss.ss_coupon_amt,\n        ss.ss_net_paid,\n        ss.ss_net_paid_inc_tax,\n        ss.ss_net_profit,\n        ss.ss_sold_date_sk\n        where ss.ss_sold_date_sk is null\n        sort by ss.ss_sold_date_sk\n;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/time_dim.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists time_dim;\n\ncreate table time_dim\nstored as ${FILE}\nas select * from ${SOURCE}.time_dim;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/warehouse.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists warehouse;\n\ncreate table warehouse\nstored as ${FILE}\nas select * from ${SOURCE}.warehouse;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/web_page.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists web_page;\n\ncreate table web_page\nstored as ${FILE}\nas select * from ${SOURCE}.web_page;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/web_returns.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists web_returns;\n\ncreate table web_returns\n(\n      wr_returned_time_sk bigint\n,     wr_item_sk bigint\n,     wr_refunded_customer_sk bigint\n,     wr_refunded_cdemo_sk bigint\n,     wr_refunded_hdemo_sk bigint\n,     wr_refunded_addr_sk bigint\n,     wr_returning_customer_sk bigint\n,     wr_returning_cdemo_sk bigint\n,     wr_returning_hdemo_sk bigint\n,     wr_returning_addr_sk bigint\n,     wr_web_page_sk bigint\n,     wr_reason_sk bigint\n,     wr_order_number bigint\n,     wr_return_quantity int\n,     wr_return_amt decimal(7,2)\n,     wr_return_tax decimal(7,2)\n,     wr_return_amt_inc_tax decimal(7,2)\n,     wr_fee decimal(7,2)\n,     wr_return_ship_cost decimal(7,2)\n,     wr_refunded_cash decimal(7,2)\n,     wr_reversed_charge decimal(7,2)\n,     wr_account_credit decimal(7,2)\n,     wr_net_loss decimal(7,2)\n)\npartitioned by (wr_returned_date_sk       bigint)\nstored as ${FILE};\n\nfrom ${SOURCE}.web_returns wr\ninsert overwrite table web_returns partition (wr_returned_date_sk)\nselect\n        wr.wr_returned_time_sk,\n        wr.wr_item_sk,\n        wr.wr_refunded_customer_sk,\n        wr.wr_refunded_cdemo_sk,\n        wr.wr_refunded_hdemo_sk,\n        wr.wr_refunded_addr_sk,\n        wr.wr_returning_customer_sk,\n        wr.wr_returning_cdemo_sk,\n        wr.wr_returning_hdemo_sk,\n        wr.wr_returning_addr_sk,\n        wr.wr_web_page_sk,\n        wr.wr_reason_sk,\n        wr.wr_order_number,\n        wr.wr_return_quantity,\n        wr.wr_return_amt,\n        wr.wr_return_tax,\n        wr.wr_return_amt_inc_tax,\n        wr.wr_fee,\n        wr.wr_return_ship_cost,\n        wr.wr_refunded_cash,\n        wr.wr_reversed_charge,\n        wr.wr_account_credit,\n        wr.wr_net_loss,\n\t\twr.wr_returned_date_sk\n        where wr.wr_returned_date_sk is not null\ninsert overwrite table web_returns partition (wr_returned_date_sk)\nselect\n        
wr.wr_returned_time_sk,\n        wr.wr_item_sk,\n        wr.wr_refunded_customer_sk,\n        wr.wr_refunded_cdemo_sk,\n        wr.wr_refunded_hdemo_sk,\n        wr.wr_refunded_addr_sk,\n        wr.wr_returning_customer_sk,\n        wr.wr_returning_cdemo_sk,\n        wr.wr_returning_hdemo_sk,\n        wr.wr_returning_addr_sk,\n        wr.wr_web_page_sk,\n        wr.wr_reason_sk,\n        wr.wr_order_number,\n        wr.wr_return_quantity,\n        wr.wr_return_amt,\n        wr.wr_return_tax,\n        wr.wr_return_amt_inc_tax,\n        wr.wr_fee,\n        wr.wr_return_ship_cost,\n        wr.wr_refunded_cash,\n        wr.wr_reversed_charge,\n        wr.wr_account_credit,\n        wr.wr_net_loss,\n        wr.wr_returned_date_sk\n        where wr.wr_returned_date_sk is null\n        sort by wr.wr_returned_date_sk\n;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/web_sales.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists web_sales;\n\ncreate table web_sales\n(\n    ws_sold_time_sk           bigint,\n    ws_ship_date_sk           bigint,\n    ws_item_sk                bigint,\n    ws_bill_customer_sk       bigint,\n    ws_bill_cdemo_sk          bigint,\n    ws_bill_hdemo_sk          bigint,\n    ws_bill_addr_sk           bigint,\n    ws_ship_customer_sk       bigint,\n    ws_ship_cdemo_sk          bigint,\n    ws_ship_hdemo_sk          bigint,\n    ws_ship_addr_sk           bigint,\n    ws_web_page_sk            bigint,\n    ws_web_site_sk            bigint,\n    ws_ship_mode_sk           bigint,\n    ws_warehouse_sk           bigint,\n    ws_promo_sk               bigint,\n    ws_order_number           bigint,\n    ws_quantity               int,\n    ws_wholesale_cost         decimal(7,2),\n    ws_list_price             decimal(7,2),\n    ws_sales_price            decimal(7,2),\n    ws_ext_discount_amt       decimal(7,2),\n    ws_ext_sales_price        decimal(7,2),\n    ws_ext_wholesale_cost     decimal(7,2),\n    ws_ext_list_price         decimal(7,2),\n    ws_ext_tax                decimal(7,2),\n    ws_coupon_amt             decimal(7,2),\n    ws_ext_ship_cost          decimal(7,2),\n    ws_net_paid               decimal(7,2),\n    ws_net_paid_inc_tax       decimal(7,2),\n    ws_net_paid_inc_ship      decimal(7,2),\n    ws_net_paid_inc_ship_tax  decimal(7,2),\n    ws_net_profit             decimal(7,2)\n)\npartitioned by (ws_sold_date_sk           bigint)\nstored as ${FILE};\n\nfrom ${SOURCE}.web_sales ws\ninsert overwrite table web_sales partition (ws_sold_date_sk) \nselect\n        ws.ws_sold_time_sk,\n        ws.ws_ship_date_sk,\n        ws.ws_item_sk,\n        ws.ws_bill_customer_sk,\n        ws.ws_bill_cdemo_sk,\n        ws.ws_bill_hdemo_sk,\n        ws.ws_bill_addr_sk,\n        ws.ws_ship_customer_sk,\n        ws.ws_ship_cdemo_sk,\n        ws.ws_ship_hdemo_sk,\n        
ws.ws_ship_addr_sk,\n        ws.ws_web_page_sk,\n        ws.ws_web_site_sk,\n        ws.ws_ship_mode_sk,\n        ws.ws_warehouse_sk,\n        ws.ws_promo_sk,\n        ws.ws_order_number,\n        ws.ws_quantity,\n        ws.ws_wholesale_cost,\n        ws.ws_list_price,\n        ws.ws_sales_price,\n        ws.ws_ext_discount_amt,\n        ws.ws_ext_sales_price,\n        ws.ws_ext_wholesale_cost,\n        ws.ws_ext_list_price,\n        ws.ws_ext_tax,\n        ws.ws_coupon_amt,\n        ws.ws_ext_ship_cost,\n        ws.ws_net_paid,\n        ws.ws_net_paid_inc_tax,\n        ws.ws_net_paid_inc_ship,\n        ws.ws_net_paid_inc_ship_tax,\n        ws.ws_net_profit,\n        ws.ws_sold_date_sk\n        where ws.ws_sold_date_sk is not null\ninsert overwrite table web_sales partition (ws_sold_date_sk) \nselect\n        ws.ws_sold_time_sk,\n        ws.ws_ship_date_sk,\n        ws.ws_item_sk,\n        ws.ws_bill_customer_sk,\n        ws.ws_bill_cdemo_sk,\n        ws.ws_bill_hdemo_sk,\n        ws.ws_bill_addr_sk,\n        ws.ws_ship_customer_sk,\n        ws.ws_ship_cdemo_sk,\n        ws.ws_ship_hdemo_sk,\n        ws.ws_ship_addr_sk,\n        ws.ws_web_page_sk,\n        ws.ws_web_site_sk,\n        ws.ws_ship_mode_sk,\n        ws.ws_warehouse_sk,\n        ws.ws_promo_sk,\n        ws.ws_order_number,\n        ws.ws_quantity,\n        ws.ws_wholesale_cost,\n        ws.ws_list_price,\n        ws.ws_sales_price,\n        ws.ws_ext_discount_amt,\n        ws.ws_ext_sales_price,\n        ws.ws_ext_wholesale_cost,\n        ws.ws_ext_list_price,\n        ws.ws_ext_tax,\n        ws.ws_coupon_amt,\n        ws.ws_ext_ship_cost,\n        ws.ws_net_paid,\n        ws.ws_net_paid_inc_tax,\n        ws.ws_net_paid_inc_ship,\n        ws.ws_net_paid_inc_ship_tax,\n        ws.ws_net_profit,\n        ws.ws_sold_date_sk\n        where ws.ws_sold_date_sk is null\n        sort by ws.ws_sold_date_sk\n;\n"
  },
  {
    "path": "ddl-tpcds/bin_partitioned/web_site.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists web_site;\n\ncreate table web_site\nstored as ${FILE}\nas select * from ${SOURCE}.web_site;\n"
  },
  {
    "path": "ddl-tpcds/text/alltables.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n-- Table<store_sales (23 cols)  partition=ss_sold_date_sk>\n\ndrop table if exists store_sales;\ncreate external table if not exists store_sales(\n      ss_sold_date_sk bigint\n,     ss_sold_time_sk bigint\n,     ss_item_sk bigint\n,     ss_customer_sk bigint\n,     ss_cdemo_sk bigint\n,     ss_hdemo_sk bigint\n,     ss_addr_sk bigint\n,     ss_store_sk bigint\n,     ss_promo_sk bigint\n,     ss_ticket_number bigint\n,     ss_quantity int\n,     ss_wholesale_cost decimal(7,2)\n,     ss_list_price decimal(7,2)\n,     ss_sales_price decimal(7,2)\n,     ss_ext_discount_amt decimal(7,2)\n,     ss_ext_sales_price decimal(7,2)\n,     ss_ext_wholesale_cost decimal(7,2)\n,     ss_ext_list_price decimal(7,2)\n,     ss_ext_tax decimal(7,2)\n,     ss_coupon_amt decimal(7,2)\n,     ss_net_paid decimal(7,2)\n,     ss_net_paid_inc_tax decimal(7,2)\n,     ss_net_profit decimal(7,2)  \n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/store_sales'\n;\n\n-- Table<store_returns (20 cols)  partition=sr_returned_date_sk>\n\ndrop table if exists store_returns;\ncreate external table if not exists store_returns(\n      sr_returned_date_sk bigint\n,     sr_return_time_sk bigint\n,     sr_item_sk bigint\n,     sr_customer_sk bigint\n,     sr_cdemo_sk bigint\n,     sr_hdemo_sk bigint\n,     sr_addr_sk bigint\n,     sr_store_sk bigint\n,     sr_reason_sk bigint\n,     sr_ticket_number bigint\n,     sr_return_quantity int\n,     sr_return_amt decimal(7,2)\n,     sr_return_tax decimal(7,2)\n,     sr_return_amt_inc_tax decimal(7,2)\n,     sr_fee decimal(7,2)\n,     sr_return_ship_cost decimal(7,2)\n,     sr_refunded_cash decimal(7,2)\n,     sr_reversed_charge decimal(7,2)\n,     sr_store_credit decimal(7,2)\n,     sr_net_loss decimal(7,2)\n)\nrow format delimited fields terminated by '|' \nlocation '${LOCATION}/store_returns'\n;\n\n-- Table<catalog_sales (34 cols)  partition=cs_sold_date_sk>\n\ndrop table 
if exists catalog_sales;\ncreate external table if not exists catalog_sales(\n      cs_sold_date_sk bigint\n,     cs_sold_time_sk bigint\n,     cs_ship_date_sk bigint\n,     cs_bill_customer_sk bigint\n,     cs_bill_cdemo_sk bigint\n,     cs_bill_hdemo_sk bigint\n,     cs_bill_addr_sk bigint\n,     cs_ship_customer_sk bigint\n,     cs_ship_cdemo_sk bigint\n,     cs_ship_hdemo_sk bigint\n,     cs_ship_addr_sk bigint\n,     cs_call_center_sk bigint\n,     cs_catalog_page_sk bigint\n,     cs_ship_mode_sk bigint\n,     cs_warehouse_sk bigint\n,     cs_item_sk bigint\n,     cs_promo_sk bigint\n,     cs_order_number bigint\n,     cs_quantity int\n,     cs_wholesale_cost decimal(7,2)\n,     cs_list_price decimal(7,2)\n,     cs_sales_price decimal(7,2)\n,     cs_ext_discount_amt decimal(7,2)\n,     cs_ext_sales_price decimal(7,2)\n,     cs_ext_wholesale_cost decimal(7,2)\n,     cs_ext_list_price decimal(7,2)\n,     cs_ext_tax decimal(7,2)\n,     cs_coupon_amt decimal(7,2)\n,     cs_ext_ship_cost decimal(7,2)\n,     cs_net_paid decimal(7,2)\n,     cs_net_paid_inc_tax decimal(7,2)\n,     cs_net_paid_inc_ship decimal(7,2)\n,     cs_net_paid_inc_ship_tax decimal(7,2)\n,     cs_net_profit decimal(7,2)\n)\nrow format delimited fields terminated by '|' \nlocation '${LOCATION}/catalog_sales'\n;\n\n-- Table<catalog_returns (27 cols)  partition=cr_returned_date_sk>\n\ndrop table if exists catalog_returns;\ncreate external table if not exists catalog_returns(\n      cr_returned_date_sk bigint\n,     cr_returned_time_sk bigint\n,     cr_item_sk bigint\n,     cr_refunded_customer_sk bigint\n,     cr_refunded_cdemo_sk bigint\n,     cr_refunded_hdemo_sk bigint\n,     cr_refunded_addr_sk bigint\n,     cr_returning_customer_sk bigint\n,     cr_returning_cdemo_sk bigint\n,     cr_returning_hdemo_sk bigint\n,     cr_returning_addr_sk bigint\n,     cr_call_center_sk bigint\n,     cr_catalog_page_sk bigint\n,     cr_ship_mode_sk bigint\n,     cr_warehouse_sk bigint\n,     cr_reason_sk 
bigint\n,     cr_order_number bigint\n,     cr_return_quantity int\n,     cr_return_amount decimal(7,2)\n,     cr_return_tax decimal(7,2)\n,     cr_return_amt_inc_tax decimal(7,2)\n,     cr_fee decimal(7,2)\n,     cr_return_ship_cost decimal(7,2)\n,     cr_refunded_cash decimal(7,2)\n,     cr_reversed_charge decimal(7,2)\n,     cr_store_credit decimal(7,2)\n,     cr_net_loss decimal(7,2)  \n)\nrow format delimited fields terminated by '|' \nlocation '${LOCATION}/catalog_returns'\n;\n\n-- Table<web_sales (34 cols)  partition=ws_sold_date_sk>\n\ndrop table if exists web_sales;\ncreate external table if not exists web_sales(\n      ws_sold_date_sk bigint\n,     ws_sold_time_sk bigint\n,     ws_ship_date_sk bigint\n,     ws_item_sk bigint\n,     ws_bill_customer_sk bigint\n,     ws_bill_cdemo_sk bigint\n,     ws_bill_hdemo_sk bigint\n,     ws_bill_addr_sk bigint\n,     ws_ship_customer_sk bigint\n,     ws_ship_cdemo_sk bigint\n,     ws_ship_hdemo_sk bigint\n,     ws_ship_addr_sk bigint\n,     ws_web_page_sk bigint\n,     ws_web_site_sk bigint\n,     ws_ship_mode_sk bigint\n,     ws_warehouse_sk bigint\n,     ws_promo_sk bigint\n,     ws_order_number bigint\n,     ws_quantity int\n,     ws_wholesale_cost decimal(7,2)\n,     ws_list_price decimal(7,2)\n,     ws_sales_price decimal(7,2)\n,     ws_ext_discount_amt decimal(7,2)\n,     ws_ext_sales_price decimal(7,2)\n,     ws_ext_wholesale_cost decimal(7,2)\n,     ws_ext_list_price decimal(7,2)\n,     ws_ext_tax decimal(7,2)\n,     ws_coupon_amt decimal(7,2)\n,     ws_ext_ship_cost decimal(7,2)\n,     ws_net_paid decimal(7,2)\n,     ws_net_paid_inc_tax decimal(7,2)\n,     ws_net_paid_inc_ship decimal(7,2)\n,     ws_net_paid_inc_ship_tax decimal(7,2)\n,     ws_net_profit decimal(7,2)\n)\nrow format delimited fields terminated by '|' \nlocation '${LOCATION}/web_sales'\n;\n\n-- Table<web_returns (24 cols)  partition=wr_returned_date_sk>\n\ndrop table if exists web_returns;\ncreate external table if not exists web_returns(\n    
  wr_returned_date_sk bigint\n,     wr_returned_time_sk bigint\n,     wr_item_sk bigint\n,     wr_refunded_customer_sk bigint\n,     wr_refunded_cdemo_sk bigint\n,     wr_refunded_hdemo_sk bigint\n,     wr_refunded_addr_sk bigint\n,     wr_returning_customer_sk bigint\n,     wr_returning_cdemo_sk bigint\n,     wr_returning_hdemo_sk bigint\n,     wr_returning_addr_sk bigint\n,     wr_web_page_sk bigint\n,     wr_reason_sk bigint\n,     wr_order_number bigint\n,     wr_return_quantity int\n,     wr_return_amt decimal(7,2)\n,     wr_return_tax decimal(7,2)\n,     wr_return_amt_inc_tax decimal(7,2)\n,     wr_fee decimal(7,2)\n,     wr_return_ship_cost decimal(7,2)\n,     wr_refunded_cash decimal(7,2)\n,     wr_reversed_charge decimal(7,2)\n,     wr_account_credit decimal(7,2)\n,     wr_net_loss decimal(7,2) \n)\nrow format delimited fields terminated by '|' \nlocation '${LOCATION}/web_returns'\n;\n\n-- Table<inventory (4 cols)>\n\ndrop table if exists inventory;\ncreate external table if not exists inventory(\n      inv_date_sk bigint\n,     inv_item_sk bigint\n,     inv_warehouse_sk bigint\n,     inv_quantity_on_hand int\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/inventory';\n\n-- Table<store (29 cols)>\n\ndrop table if exists store;\ncreate external table if not exists store(\n      s_store_sk bigint\n,     s_store_id char(16)\n,     s_rec_start_date date\n,     s_rec_end_date date\n,     s_closed_date_sk bigint\n,     s_store_name varchar(50)\n,     s_number_employees int\n,     s_floor_space int\n,     s_hours char(20)\n,     s_manager varchar(40)\n,     s_market_id int\n,     s_geography_class varchar(100)\n,     s_market_desc varchar(100)\n,     s_market_manager varchar(40)\n,     s_division_id int\n,     s_division_name varchar(50)\n,     s_company_id int\n,     s_company_name varchar(50)\n,     s_street_number varchar(10)\n,     s_street_name varchar(60)\n,     s_street_type char(15)\n,     s_suite_number char(10)\n,     s_city 
varchar(60)\n,     s_county varchar(30)\n,     s_state char(2)\n,     s_zip char(10)\n,     s_country varchar(20)\n,     s_gmt_offset decimal(5,2)\n,     s_tax_percentage decimal(5,2)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/store'\ntblproperties ('serialization.null.format'='');\n\n-- Table<call_center (31 cols)>\n\ndrop table if exists call_center;\ncreate external table if not exists call_center(\n      cc_call_center_sk bigint\n,     cc_call_center_id char(16)\n,     cc_rec_start_date date\n,     cc_rec_end_date date\n,     cc_closed_date_sk bigint\n,     cc_open_date_sk bigint\n,     cc_name varchar(50)\n,     cc_class varchar(50)\n,     cc_employees int\n,     cc_sq_ft int\n,     cc_hours char(20)\n,     cc_manager varchar(40)\n,     cc_mkt_id int\n,     cc_mkt_class char(50)\n,     cc_mkt_desc varchar(100)\n,     cc_market_manager varchar(40)\n,     cc_division int\n,     cc_division_name varchar(50)\n,     cc_company int\n,     cc_company_name char(50)\n,     cc_street_number char(10)\n,     cc_street_name varchar(60)\n,     cc_street_type char(15)\n,     cc_suite_number char(10)\n,     cc_city varchar(60)\n,     cc_county varchar(30)\n,     cc_state char(2)\n,     cc_zip char(10)\n,     cc_country varchar(20)\n,     cc_gmt_offset decimal(5,2)\n,     cc_tax_percentage decimal(5,2)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/call_center'\ntblproperties ('serialization.null.format'='');\n\n-- Table<catalog_page (9 cols)>\n\ndrop table if exists catalog_page;\ncreate external table if not exists catalog_page(\n      cp_catalog_page_sk bigint\n,     cp_catalog_page_id char(16)\n,     cp_start_date_sk bigint\n,     cp_end_date_sk bigint\n,     cp_department varchar(50)\n,     cp_catalog_number int\n,     cp_catalog_page_number int\n,     cp_description varchar(100)\n,     cp_type varchar(100)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/catalog_page'\ntblproperties 
('serialization.null.format'='');\n\n-- Table<web_site (26 cols)>\n\ndrop table if exists web_site;\ncreate external table if not exists web_site(\n      web_site_sk bigint\n,     web_site_id char(16)\n,     web_rec_start_date date\n,     web_rec_end_date date\n,     web_name varchar(50)\n,     web_open_date_sk bigint\n,     web_close_date_sk bigint\n,     web_class varchar(50)\n,     web_manager varchar(40)\n,     web_mkt_id int\n,     web_mkt_class varchar(50)\n,     web_mkt_desc varchar(100)\n,     web_market_manager varchar(40)\n,     web_company_id int\n,     web_company_name char(50)\n,     web_street_number char(10)\n,     web_street_name varchar(60)\n,     web_street_type char(15)\n,     web_suite_number char(10)\n,     web_city varchar(60)\n,     web_county varchar(30)\n,     web_state char(2)\n,     web_zip char(10)\n,     web_country varchar(20)\n,     web_gmt_offset decimal(5,2)  \n,     web_tax_percentage decimal(5,2)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/web_site'\ntblproperties ('serialization.null.format'='');\n\n-- Table<web_page (14 cols)>\n\ndrop table if exists web_page;\ncreate external table if not exists web_page(\n      wp_web_page_sk bigint\n,     wp_web_page_id char(16)\n,     wp_rec_start_date date\n,     wp_rec_end_date date\n,     wp_creation_date_sk bigint\n,     wp_access_date_sk bigint\n,     wp_autogen_flag char(1)\n,     wp_customer_sk bigint\n,     wp_url varchar(100)\n,     wp_type char(50)\n,     wp_char_count int\n,     wp_link_count int\n,     wp_image_count int\n,     wp_max_ad_count int\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/web_page'\ntblproperties ('serialization.null.format'='');\n\n-- Table<warehouse (14 cols)>\n\ndrop table if exists warehouse;\ncreate external table if not exists warehouse(\n      w_warehouse_sk bigint\n,     w_warehouse_id char(16)\n,     w_warehouse_name varchar(20)\n,     w_warehouse_sq_ft int\n,     w_street_number char(10)\n,     
w_street_name varchar(60)\n,     w_street_type char(15)\n,     w_suite_number char(10)\n,     w_city varchar(60)\n,     w_county varchar(30)\n,     w_state char(2)\n,     w_zip char(10)\n,     w_country varchar(20)\n,     w_gmt_offset decimal(5,2)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/warehouse'\ntblproperties ('serialization.null.format'='');\n\n-- Table<customer (18 cols)>\n\ndrop table if exists customer;\ncreate external table if not exists customer(\n      c_customer_sk bigint\n,     c_customer_id char(16)\n,     c_current_cdemo_sk bigint\n,     c_current_hdemo_sk bigint\n,     c_current_addr_sk bigint\n,     c_first_shipto_date_sk bigint\n,     c_first_sales_date_sk bigint\n,     c_salutation char(10)\n,     c_first_name char(20)\n,     c_last_name char(30)\n,     c_preferred_cust_flag char(1)\n,     c_birth_day int\n,     c_birth_month int\n,     c_birth_year int\n,     c_birth_country varchar(20)\n,     c_login char(13)\n,     c_email_address char(50)\n,     c_last_review_date_sk bigint\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/customer'\ntblproperties ('serialization.null.format'='');\n\n-- Table<customer_address (13 cols)>\n\ndrop table if exists customer_address;\ncreate external table if not exists customer_address(\n      ca_address_sk bigint\n,     ca_address_id char(16)\n,     ca_street_number char(10)\n,     ca_street_name varchar(60)\n,     ca_street_type char(15)\n,     ca_suite_number char(10)\n,     ca_city varchar(60)\n,     ca_county varchar(30)\n,     ca_state char(2)\n,     ca_zip char(10)\n,     ca_country varchar(20)\n,     ca_gmt_offset decimal(5,2)\n,     ca_location_type char(20)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/customer_address'\ntblproperties ('serialization.null.format'='');\n\n-- Table<customer_demographics (9 cols)>\n\ndrop table if exists customer_demographics;\ncreate external table if not exists customer_demographics(\n      
cd_demo_sk bigint\n,     cd_gender char(1)\n,     cd_marital_status char(1)\n,     cd_education_status char(20)\n,     cd_purchase_estimate int\n,     cd_credit_rating char(10)\n,     cd_dep_count int\n,     cd_dep_employed_count int\n,     cd_dep_college_count int\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/customer_demographics'\ntblproperties ('serialization.null.format'='');\n\n-- Table<date_dim (28 cols)>\n\ndrop table if exists date_dim;\ncreate external table if not exists date_dim(\n      d_date_sk bigint\n,     d_date_id char(16)\n,     d_date date\n,     d_month_seq int\n,     d_week_seq int\n,     d_quarter_seq int\n,     d_year int\n,     d_dow int\n,     d_moy int\n,     d_dom int\n,     d_qoy int\n,     d_fy_year int\n,     d_fy_quarter_seq int\n,     d_fy_week_seq int\n,     d_day_name char(9)\n,     d_quarter_name char(6)\n,     d_holiday char(1)\n,     d_weekend char(1)\n,     d_following_holiday char(1)\n,     d_first_dom int\n,     d_last_dom int\n,     d_same_day_ly int\n,     d_same_day_lq int\n,     d_current_day char(1)\n,     d_current_week char(1)\n,     d_current_month char(1)\n,     d_current_quarter char(1)\n,     d_current_year char(1)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/date_dim'\ntblproperties ('serialization.null.format'='');\n\n-- Table<household_demographics (5 cols)>\n\ndrop table if exists household_demographics;\ncreate external table if not exists household_demographics(\n      hd_demo_sk bigint\n,     hd_income_band_sk bigint\n,     hd_buy_potential char(15)\n,     hd_dep_count int\n,     hd_vehicle_count int\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/household_demographics'\ntblproperties ('serialization.null.format'='');\n\n-- Table<item (22 cols)>\n\ndrop table if exists item;\ncreate external table if not exists item(\n      i_item_sk bigint\n,     i_item_id char(16)\n,     i_rec_start_date date\n,     i_rec_end_date date\n,     
i_item_desc varchar(200)\n,     i_current_price decimal(7,2)\n,     i_wholesale_cost decimal(7,2)\n,     i_brand_id int\n,     i_brand char(50)\n,     i_class_id int\n,     i_class char(50)\n,     i_category_id int\n,     i_category char(50)\n,     i_manufact_id int\n,     i_manufact char(50)\n,     i_size char(20)\n,     i_formulation char(20)\n,     i_color char(20)\n,     i_units char(10)\n,     i_container char(10)\n,     i_manager_id int\n,     i_product_name char(50)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/item'\ntblproperties ('serialization.null.format'='');\n\n-- Table<income_band (3 cols)>\n\ndrop table if exists income_band;\ncreate external table if not exists income_band(\n      ib_income_band_sk bigint\n,     ib_lower_bound int\n,     ib_upper_bound int\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/income_band';\n\n-- Table<promotion (19 cols)>\n\ndrop table if exists promotion;\ncreate external table if not exists promotion(\n      p_promo_sk bigint\n,     p_promo_id char(16)\n,     p_start_date_sk bigint\n,     p_end_date_sk bigint\n,     p_item_sk bigint\n,     p_cost decimal(15,2)\n,     p_response_target int\n,     p_promo_name char(50)\n,     p_channel_dmail char(1)\n,     p_channel_email char(1)\n,     p_channel_catalog char(1)\n,     p_channel_tv char(1)\n,     p_channel_radio char(1)\n,     p_channel_press char(1)\n,     p_channel_event char(1)\n,     p_channel_demo char(1)\n,     p_channel_details varchar(100)\n,     p_purpose char(15)\n,     p_discount_active char(1)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/promotion'\ntblproperties ('serialization.null.format'='');\n\n-- Table<reason (3 cols)>\n\ndrop table if exists reason;\ncreate external table if not exists reason(\n      r_reason_sk bigint\n,     r_reason_id char(16)\n,     r_reason_desc char(100)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/reason'\ntblproperties 
('serialization.null.format'='');\n\n-- Table<ship_mode (6 cols)>\n\ndrop table if exists ship_mode;\ncreate external table if not exists ship_mode(\n      sm_ship_mode_sk bigint\n,     sm_ship_mode_id char(16)\n,     sm_type char(30)\n,     sm_code char(10)\n,     sm_carrier char(20)\n,     sm_contract char(20)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/ship_mode'\ntblproperties ('serialization.null.format'='');\n\n-- Table<time_dim (10 cols)>\n\ndrop table if exists time_dim;\ncreate external table if not exists time_dim(\n      t_time_sk bigint\n,     t_time_id char(16)\n,     t_time int\n,     t_hour int\n,     t_minute int\n,     t_second int\n,     t_am_pm char(2)\n,     t_shift char(20)\n,     t_sub_shift char(20)\n,     t_meal_time char(20)\n)\nrow format delimited fields terminated by '|'\nlocation '${LOCATION}/time_dim'\ntblproperties ('serialization.null.format'='');\n\n\n"
  },
  {
    "path": "ddl-tpcds/text/analyze_everything.sql",
    "content": "analyze table call_center compute statistics for columns;\nanalyze table catalog_page compute statistics for columns;\nanalyze table catalog_returns compute statistics for columns;\nanalyze table catalog_sales compute statistics for columns;\nanalyze table customer compute statistics for columns;\nanalyze table customer_address compute statistics for columns;\nanalyze table customer_demographics compute statistics for columns;\nanalyze table date_dim compute statistics for columns;\nanalyze table household_demographics compute statistics for columns;\nanalyze table income_band compute statistics for columns;\nanalyze table inventory compute statistics for columns;\nanalyze table item compute statistics for columns;\nanalyze table promotion compute statistics for columns;\nanalyze table reason compute statistics for columns;\nanalyze table ship_mode compute statistics for columns;\nanalyze table store compute statistics for columns;\nanalyze table store_returns compute statistics for columns;\nanalyze table store_sales compute statistics for columns;\nanalyze table time_dim compute statistics for columns;\nanalyze table warehouse compute statistics for columns;\nanalyze table web_page compute statistics for columns;\nanalyze table web_returns compute statistics for columns;\nanalyze table web_sales compute statistics for columns;\nanalyze table web_site compute statistics for columns;"
  },
  {
    "path": "ddl-tpch/bin_flat/alltables.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists lineitem;\ncreate external table lineitem \n(L_ORDERKEY BIGINT,\n L_PARTKEY BIGINT,\n L_SUPPKEY BIGINT,\n L_LINENUMBER INT,\n L_QUANTITY DOUBLE,\n L_EXTENDEDPRICE DOUBLE,\n L_DISCOUNT DOUBLE,\n L_TAX DOUBLE,\n L_RETURNFLAG STRING,\n L_LINESTATUS STRING,\n L_SHIPDATE STRING,\n L_COMMITDATE STRING,\n L_RECEIPTDATE STRING,\n L_SHIPINSTRUCT STRING,\n L_SHIPMODE STRING,\n L_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE \nLOCATION '${LOCATION}/lineitem';\n\ndrop table if exists part;\ncreate external table part (P_PARTKEY BIGINT,\n P_NAME STRING,\n P_MFGR STRING,\n P_BRAND STRING,\n P_TYPE STRING,\n P_SIZE INT,\n P_CONTAINER STRING,\n P_RETAILPRICE DOUBLE,\n P_COMMENT STRING) \nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE \nLOCATION '${LOCATION}/part/';\n\ndrop table if exists supplier;\ncreate external table supplier (S_SUPPKEY BIGINT,\n S_NAME STRING,\n S_ADDRESS STRING,\n S_NATIONKEY BIGINT,\n S_PHONE STRING,\n S_ACCTBAL DOUBLE,\n S_COMMENT STRING) \nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE \nLOCATION '${LOCATION}/supplier/';\n\ndrop table if exists partsupp;\ncreate external table partsupp (PS_PARTKEY BIGINT,\n PS_SUPPKEY BIGINT,\n PS_AVAILQTY INT,\n PS_SUPPLYCOST DOUBLE,\n PS_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE\nLOCATION '${LOCATION}/partsupp';\n\ndrop table if exists nation;\ncreate external table nation (N_NATIONKEY BIGINT,\n N_NAME STRING,\n N_REGIONKEY BIGINT,\n N_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE\nLOCATION '${LOCATION}/nation';\n\ndrop table if exists region;\ncreate external table region (R_REGIONKEY BIGINT,\n R_NAME STRING,\n R_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE\nLOCATION '${LOCATION}/region';\n\ndrop table if exists customer;\ncreate external table 
customer (C_CUSTKEY BIGINT,\n C_NAME STRING,\n C_ADDRESS STRING,\n C_NATIONKEY BIGINT,\n C_PHONE STRING,\n C_ACCTBAL DOUBLE,\n C_MKTSEGMENT STRING,\n C_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE\nLOCATION '${LOCATION}/customer';\n\ndrop table if exists orders;\ncreate external table orders (O_ORDERKEY BIGINT,\n O_CUSTKEY BIGINT,\n O_ORDERSTATUS STRING,\n O_TOTALPRICE DOUBLE,\n O_ORDERDATE STRING,\n O_ORDERPRIORITY STRING,\n O_CLERK STRING,\n O_SHIPPRIORITY INT,\n O_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE\nLOCATION '${LOCATION}/orders';\n"
  },
  {
    "path": "ddl-tpch/bin_flat/analyze.sql",
    "content": "analyze table nation compute statistics for columns;\nanalyze table region compute statistics for columns;\nanalyze table supplier compute statistics for columns;\nanalyze table part compute statistics for columns;\nanalyze table partsupp compute statistics for columns;\nanalyze table customer compute statistics for columns;\nanalyze table orders compute statistics for columns;\nanalyze table lineitem compute statistics for columns;\n"
  },
  {
    "path": "ddl-tpch/bin_flat/customer.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists customer;\n\ncreate table customer\nstored as ${FILE}\nas select * from ${SOURCE}.customer\ncluster by C_MKTSEGMENT\n;\n"
  },
  {
    "path": "ddl-tpch/bin_flat/lineitem.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists lineitem;\n\ncreate table lineitem\nstored as ${FILE}\nas select * from ${SOURCE}.lineitem \ncluster by L_SHIPDATE\n;\n"
  },
  {
    "path": "ddl-tpch/bin_flat/nation.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists nation;\n\ncreate table nation\nstored as ${FILE}\nas select distinct * from ${SOURCE}.nation;\n"
  },
  {
    "path": "ddl-tpch/bin_flat/orders.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists orders;\n\ncreate table orders\nstored as ${FILE}\nas select * from ${SOURCE}.orders\ncluster by o_orderdate\n;\n"
  },
  {
    "path": "ddl-tpch/bin_flat/part.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists part;\n\ncreate table part\nstored as ${FILE}\nas select * from ${SOURCE}.part\ncluster by p_brand\n;\n"
  },
  {
    "path": "ddl-tpch/bin_flat/partsupp.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists partsupp;\n\ncreate table partsupp\nstored as ${FILE}\nas select * from ${SOURCE}.partsupp\ncluster by PS_SUPPKEY\n;\n"
  },
  {
    "path": "ddl-tpch/bin_flat/region.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists region;\n\ncreate table region\nstored as ${FILE}\nas select distinct * from ${SOURCE}.region;\n"
  },
  {
    "path": "ddl-tpch/bin_flat/supplier.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists supplier;\n\ncreate table supplier\nstored as ${FILE}\nas select * from ${SOURCE}.supplier\ncluster by s_nationkey, s_suppkey\n;\n"
  },
  {
    "path": "ddl-tpch/bin_partitioned/analyze.sql",
    "content": "analyze table nation compute statistics for columns;\nanalyze table region compute statistics for columns;\nanalyze table supplier compute statistics for columns;\nanalyze table part compute statistics for columns;\nanalyze table partsupp compute statistics for columns;\nanalyze table customer compute statistics for columns;\nanalyze table orders compute statistics for columns;\nanalyze table lineitem compute statistics for columns;\n"
  },
  {
    "path": "ddl-tpch/bin_partitioned/customer.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists customer;\n\ncreate table customer\nstored as ${FILE}\nTBLPROPERTIES('orc.bloom.filter.columns'='*','orc.compress'='ZLIB')\nas select * from ${SOURCE}.customer\ncluster by C_MKTSEGMENT\n;\n\n"
  },
  {
    "path": "ddl-tpch/bin_partitioned/lineitem.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists lineitem;\n\ncreate table lineitem \n(L_ORDERKEY BIGINT,\n L_PARTKEY BIGINT,\n L_SUPPKEY BIGINT,\n L_LINENUMBER INT,\n L_QUANTITY DOUBLE,\n L_EXTENDEDPRICE DOUBLE,\n L_DISCOUNT DOUBLE,\n L_TAX DOUBLE,\n L_RETURNFLAG STRING,\n L_LINESTATUS STRING,\n L_COMMITDATE STRING,\n L_RECEIPTDATE STRING,\n L_SHIPINSTRUCT STRING,\n L_SHIPMODE STRING,\n L_COMMENT STRING)\n partitioned by (L_SHIPDATE STRING)\nstored as ${FILE}\n;\n\nALTER TABLE lineitem SET TBLPROPERTIES('orc.bloom.filter.columns'='*','orc.compress'='ZLIB');\n\nINSERT OVERWRITE TABLE lineitem Partition(L_SHIPDATE)\nselect \nL_ORDERKEY ,\n L_PARTKEY ,\n L_SUPPKEY ,\n L_LINENUMBER ,\n L_QUANTITY ,\n L_EXTENDEDPRICE ,\n L_DISCOUNT ,\n L_TAX ,\n L_RETURNFLAG ,\n L_LINESTATUS ,\n L_COMMITDATE ,\n L_RECEIPTDATE ,\n L_SHIPINSTRUCT ,\n L_SHIPMODE ,\n L_COMMENT ,\n L_SHIPDATE\n from ${SOURCE}.lineitem\n;\n\n"
  },
  {
    "path": "ddl-tpch/bin_partitioned/nation.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists nation;\n\ncreate table nation\nstored as ${FILE}\nTBLPROPERTIES('orc.bloom.filter.columns'='*','orc.compress'='ZLIB')\nas select distinct * from ${SOURCE}.nation;\n"
  },
  {
    "path": "ddl-tpch/bin_partitioned/orders.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists orders;\n\ncreate table orders (O_ORDERKEY BIGINT,\n O_CUSTKEY BIGINT,\n O_ORDERSTATUS STRING,\n O_TOTALPRICE DOUBLE,\n O_ORDERPRIORITY STRING,\n O_CLERK STRING,\n O_SHIPPRIORITY INT,\n O_COMMENT STRING)\n partitioned by (O_ORDERDATE STRING)\nstored as ${FILE}\n;\n\nALTER TABLE orders SET TBLPROPERTIES('orc.bloom.filter.columns'='*','orc.compress'='ZLIB');\n\nINSERT OVERWRITE TABLE orders partition(O_ORDERDATE)\nselect \nO_ORDERKEY ,\n O_CUSTKEY ,\n O_ORDERSTATUS ,\n O_TOTALPRICE ,\n O_ORDERPRIORITY ,\n O_CLERK ,\n O_SHIPPRIORITY ,\n O_COMMENT,\n O_ORDERDATE\n  from ${SOURCE}.orders\n;\n\n\n"
  },
  {
    "path": "ddl-tpch/bin_partitioned/part.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists part;\n\ncreate table part\nstored as ${FILE}\nTBLPROPERTIES('orc.bloom.filter.columns'='*','orc.compress'='ZLIB')\nas select * from ${SOURCE}.part\ncluster by p_brand\n;\n"
  },
  {
    "path": "ddl-tpch/bin_partitioned/partsupp.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists partsupp;\n\ncreate table partsupp\nstored as ${FILE}\nTBLPROPERTIES('orc.bloom.filter.columns'='*','orc.compress'='ZLIB')\nas select * from ${SOURCE}.partsupp\ncluster by PS_SUPPKEY\n;\n\n"
  },
  {
    "path": "ddl-tpch/bin_partitioned/region.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists region;\n\ncreate table region\nstored as ${FILE}\nTBLPROPERTIES('orc.bloom.filter.columns'='*','orc.compress'='ZLIB')\nas select distinct * from ${SOURCE}.region;\n"
  },
  {
    "path": "ddl-tpch/bin_partitioned/supplier.sql",
    "content": "create database if not exists ${DB};\nuse ${DB};\n\ndrop table if exists supplier;\n\ncreate table supplier\nstored as ${FILE}\nTBLPROPERTIES('orc.bloom.filter.columns'='*','orc.compress'='ZLIB')\nas select * from ${SOURCE}.supplier\ncluster by s_nationkey, s_suppkey\n;\n"
  },
  {
    "path": "runSuite.pl",
    "content": "#!/usr/bin/perl\n\nuse strict;\nuse warnings;\nuse File::Basename;\n\n# PROTOTYPES\nsub dieWithUsage(;$);\n\n# GLOBALS\nmy $SCRIPT_NAME = basename( __FILE__ );\nmy $SCRIPT_PATH = dirname( __FILE__ );\n\n# MAIN\ndieWithUsage(\"one or more parameters not defined\") unless @ARGV >= 1;\nmy $suite = shift;\nmy $scale = shift || 2;\ndieWithUsage(\"suite name required\") unless $suite eq \"tpcds\" or $suite eq \"tpch\";\n\nchdir $SCRIPT_PATH;\nif( $suite eq 'tpcds' ) {\n\tchdir \"sample-queries-tpcds\";\n} else {\n\tchdir 'sample-queries-tpch';\n} # end if\nmy @queries = glob '*.sql';\n\nmy $db = { \n\t'tpcds' => \"tpcds_bin_partitioned_orc_$scale\",\n\t'tpch' => \"tpch_flat_orc_$scale\"\n};\n\nprint \"filename,status,time,rows\\n\";\nfor my $query ( @queries ) {\n\tmy $logname = \"$query.log\";\n\tmy $cmd=\"echo 'use $db->{${suite}}; source $query;' | hive -i testbench.settings 2>&1  | tee $logname\";\n#\tmy $cmd=\"cat $query.log\";\n\t#print $cmd ; exit;\n\t\n\tmy $hiveStart = time();\n\n\tmy @hiveoutput=`$cmd`;\n\tdie \"${SCRIPT_NAME}:: ERROR:  hive command unexpectedly exited \\$? = '$?', \\$! = '$!'\" if $?;\n\n\tmy $hiveEnd = time();\n\tmy $hiveTime = $hiveEnd - $hiveStart;\n\tforeach my $line ( @hiveoutput ) {\n\t\tif( $line =~ /Time taken:\\s+([\\d\\.]+)\\s+seconds,\\s+Fetched:\\s+(\\d+)\\s+row/ ) {\n\t\t\tprint \"$query,success,$hiveTime,$2\\n\"; \n\t\t} elsif( \n\t\t\t$line =~ /^FAILED: /\n\t\t\t# || /Task failed!/ \n\t\t\t) {\n\t\t\tprint \"$query,failed,$hiveTime\\n\"; \n\t\t} # end if\n\t} # end foreach\n} # end for\n\n\nsub dieWithUsage(;$) {\n\tmy $err = shift || '';\n\tif( $err ne '' ) {\n\t\tchomp $err;\n\t\t$err = \"ERROR: $err\\n\\n\";\n\t} # end if\n\n\tprint STDERR <<USAGE;\n${err}Usage:\n\tperl ${SCRIPT_NAME} [tpcds|tpch] [scale]\n\nDescription:\n\tThis script runs the sample queries and prints CSV (filename, status, time, rows) to standard output, reporting how long each query took to run.  
All Hive output is also kept in a log file named 'queryXX.sql.log' for each query file of the form 'queryXX.sql'. The scale defaults to 2.\nUSAGE\n\texit 1;\n}\n\n"
  },
  {
    "path": "sample-queries-tpcds/README.md",
    "content": "Sample TPC-DS Queries\n=====================\n\nThis directory contains sample TPC-DS queries you can run once you have generated your data. Queries are compatible with HDP 2.6 and up.\n"
  },
  {
    "path": "sample-queries-tpcds/query1.sql",
    "content": "-- start query 1 in stream 0 using template query1.tpl and seed 2031708268\nwith customer_total_return as\n(select sr_customer_sk as ctr_customer_sk\n,sr_store_sk as ctr_store_sk\n,sum(SR_FEE) as ctr_total_return\nfrom store_returns\n,date_dim\nwhere sr_returned_date_sk = d_date_sk\nand d_year =2000\ngroup by sr_customer_sk\n,sr_store_sk)\n select  c_customer_id\nfrom customer_total_return ctr1\n,store\n,customer\nwhere ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2\nfrom customer_total_return ctr2\nwhere ctr1.ctr_store_sk = ctr2.ctr_store_sk)\nand s_store_sk = ctr1.ctr_store_sk\nand s_state = 'NM'\nand ctr1.ctr_customer_sk = c_customer_sk\norder by c_customer_id\nlimit 100;\n\n-- end query 1 in stream 0 using template query1.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query10.sql",
    "content": "-- start query 1 in stream 0 using template query10.tpl and seed 797269820\nselect  \n  cd_gender,\n  cd_marital_status,\n  cd_education_status,\n  count(*) cnt1,\n  cd_purchase_estimate,\n  count(*) cnt2,\n  cd_credit_rating,\n  count(*) cnt3,\n  cd_dep_count,\n  count(*) cnt4,\n  cd_dep_employed_count,\n  count(*) cnt5,\n  cd_dep_college_count,\n  count(*) cnt6\n from\n  customer c,customer_address ca,customer_demographics\n where\n  c.c_current_addr_sk = ca.ca_address_sk and\n  ca_county in ('Fillmore County','McPherson County','Bonneville County','Boone County','Brown County') and\n  cd_demo_sk = c.c_current_cdemo_sk and \n  exists (select *\n          from store_sales,date_dim\n          where c.c_customer_sk = ss_customer_sk and\n                ss_sold_date_sk = d_date_sk and\n                d_year = 2000 and\n                d_moy between 3 and 3+3) and\n   (exists (select *\n            from web_sales,date_dim\n            where c.c_customer_sk = ws_bill_customer_sk and\n                  ws_sold_date_sk = d_date_sk and\n                  d_year = 2000 and\n                  d_moy between 3 and 3+3) or \n    exists (select * \n            from catalog_sales,date_dim\n            where c.c_customer_sk = cs_ship_customer_sk and\n                  cs_sold_date_sk = d_date_sk and\n                  d_year = 2000 and\n                  d_moy between 3 and 3+3))\n group by cd_gender,\n          cd_marital_status,\n          cd_education_status,\n          cd_purchase_estimate,\n          cd_credit_rating,\n          cd_dep_count,\n          cd_dep_employed_count,\n          cd_dep_college_count\n order by cd_gender,\n          cd_marital_status,\n          cd_education_status,\n          cd_purchase_estimate,\n          cd_credit_rating,\n          cd_dep_count,\n          cd_dep_employed_count,\n          cd_dep_college_count\nlimit 100;\n\n-- end query 1 in stream 0 using template query10.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query11.sql",
    "content": "-- start query 1 in stream 0 using template query11.tpl and seed 1819994127\nwith year_total as (\n select c_customer_id customer_id\n       ,c_first_name customer_first_name\n       ,c_last_name customer_last_name\n       ,c_preferred_cust_flag customer_preferred_cust_flag\n       ,c_birth_country customer_birth_country\n       ,c_login customer_login\n       ,c_email_address customer_email_address\n       ,d_year dyear\n       ,sum(ss_ext_list_price-ss_ext_discount_amt) year_total\n       ,'s' sale_type\n from customer\n     ,store_sales\n     ,date_dim\n where c_customer_sk = ss_customer_sk\n   and ss_sold_date_sk = d_date_sk\n group by c_customer_id\n         ,c_first_name\n         ,c_last_name\n         ,c_preferred_cust_flag \n         ,c_birth_country\n         ,c_login\n         ,c_email_address\n         ,d_year \n union all\n select c_customer_id customer_id\n       ,c_first_name customer_first_name\n       ,c_last_name customer_last_name\n       ,c_preferred_cust_flag customer_preferred_cust_flag\n       ,c_birth_country customer_birth_country\n       ,c_login customer_login\n       ,c_email_address customer_email_address\n       ,d_year dyear\n       ,sum(ws_ext_list_price-ws_ext_discount_amt) year_total\n       ,'w' sale_type\n from customer\n     ,web_sales\n     ,date_dim\n where c_customer_sk = ws_bill_customer_sk\n   and ws_sold_date_sk = d_date_sk\n group by c_customer_id\n         ,c_first_name\n         ,c_last_name\n         ,c_preferred_cust_flag \n         ,c_birth_country\n         ,c_login\n         ,c_email_address\n         ,d_year\n         )\n  select  \n                  t_s_secyear.customer_id\n                 ,t_s_secyear.customer_first_name\n                 ,t_s_secyear.customer_last_name\n                 ,t_s_secyear.customer_birth_country\n from year_total t_s_firstyear\n     ,year_total t_s_secyear\n     ,year_total t_w_firstyear\n     ,year_total t_w_secyear\n where t_s_secyear.customer_id = 
t_s_firstyear.customer_id\n         and t_s_firstyear.customer_id = t_w_secyear.customer_id\n         and t_s_firstyear.customer_id = t_w_firstyear.customer_id\n         and t_s_firstyear.sale_type = 's'\n         and t_w_firstyear.sale_type = 'w'\n         and t_s_secyear.sale_type = 's'\n         and t_w_secyear.sale_type = 'w'\n         and t_s_firstyear.dyear = 1999\n         and t_s_secyear.dyear = 1999+1\n         and t_w_firstyear.dyear = 1999\n         and t_w_secyear.dyear = 1999+1\n         and t_s_firstyear.year_total > 0\n         and t_w_firstyear.year_total > 0\n         and case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else 0.0 end\n             > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else 0.0 end\n order by t_s_secyear.customer_id\n         ,t_s_secyear.customer_first_name\n         ,t_s_secyear.customer_last_name\n         ,t_s_secyear.customer_birth_country\nlimit 100;\n\n-- end query 1 in stream 0 using template query11.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query12.sql",
    "content": "-- start query 1 in stream 0 using template query12.tpl and seed 345591136\nselect  i_item_id\n      ,i_item_desc \n      ,i_category \n      ,i_class \n      ,i_current_price\n      ,sum(ws_ext_sales_price) as itemrevenue \n      ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over\n          (partition by i_class) as revenueratio\nfrom\t\n\tweb_sales\n    \t,item \n    \t,date_dim\nwhere \n\tws_item_sk = i_item_sk \n  \tand i_category in ('Electronics', 'Books', 'Women')\n  \tand ws_sold_date_sk = d_date_sk\n\tand d_date between cast('1998-01-06' as date) \n\t\t\t\tand (cast('1998-01-06' as date) + 30 days)\ngroup by \n\ti_item_id\n        ,i_item_desc \n        ,i_category\n        ,i_class\n        ,i_current_price\norder by \n\ti_category\n        ,i_class\n        ,i_item_id\n        ,i_item_desc\n        ,revenueratio\nlimit 100;\n\n-- end query 1 in stream 0 using template query12.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query13.sql",
    "content": "-- start query 1 in stream 0 using template query13.tpl and seed 622697896\nselect avg(ss_quantity)\n       ,avg(ss_ext_sales_price)\n       ,avg(ss_ext_wholesale_cost)\n       ,sum(ss_ext_wholesale_cost)\n from store_sales\n     ,store\n     ,customer_demographics\n     ,household_demographics\n     ,customer_address\n     ,date_dim\n where s_store_sk = ss_store_sk\n and  ss_sold_date_sk = d_date_sk and d_year = 2001\n and((ss_hdemo_sk=hd_demo_sk\n  and cd_demo_sk = ss_cdemo_sk\n  and cd_marital_status = 'U'\n  and cd_education_status = 'Secondary'\n  and ss_sales_price between 100.00 and 150.00\n  and hd_dep_count = 3   \n     )or\n     (ss_hdemo_sk=hd_demo_sk\n  and cd_demo_sk = ss_cdemo_sk\n  and cd_marital_status = 'W'\n  and cd_education_status = 'College'\n  and ss_sales_price between 50.00 and 100.00   \n  and hd_dep_count = 1\n     ) or \n     (ss_hdemo_sk=hd_demo_sk\n  and cd_demo_sk = ss_cdemo_sk\n  and cd_marital_status = 'D'\n  and cd_education_status = 'Primary'\n  and ss_sales_price between 150.00 and 200.00 \n  and hd_dep_count = 1  \n     ))\n and((ss_addr_sk = ca_address_sk\n  and ca_country = 'United States'\n  and ca_state in ('TX', 'OK', 'MI')\n  and ss_net_profit between 100 and 200  \n     ) or\n     (ss_addr_sk = ca_address_sk\n  and ca_country = 'United States'\n  and ca_state in ('WA', 'NC', 'OH')\n  and ss_net_profit between 150 and 300  \n     ) or\n     (ss_addr_sk = ca_address_sk\n  and ca_country = 'United States'\n  and ca_state in ('MT', 'FL', 'GA')\n  and ss_net_profit between 50 and 250  \n     ))\n;\n\n-- end query 1 in stream 0 using template query13.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query14.sql",
    "content": "-- start query 1 in stream 0 using template query14.tpl and seed 1819994127\nwith  cross_items as\n (select i_item_sk ss_item_sk\n from item,\n (select iss.i_brand_id brand_id\n     ,iss.i_class_id class_id\n     ,iss.i_category_id category_id\n from store_sales\n     ,item iss\n     ,date_dim d1\n where ss_item_sk = iss.i_item_sk\n   and ss_sold_date_sk = d1.d_date_sk\n   and d1.d_year between 2000 AND 2000 + 2\n intersect \n select ics.i_brand_id\n     ,ics.i_class_id\n     ,ics.i_category_id\n from catalog_sales\n     ,item ics\n     ,date_dim d2\n where cs_item_sk = ics.i_item_sk\n   and cs_sold_date_sk = d2.d_date_sk\n   and d2.d_year between 2000 AND 2000 + 2\n intersect\n select iws.i_brand_id\n     ,iws.i_class_id\n     ,iws.i_category_id\n from web_sales\n     ,item iws\n     ,date_dim d3\n where ws_item_sk = iws.i_item_sk\n   and ws_sold_date_sk = d3.d_date_sk\n   and d3.d_year between 2000 AND 2000 + 2) x\n where i_brand_id = brand_id\n      and i_class_id = class_id\n      and i_category_id = category_id\n),\n avg_sales as\n (select avg(quantity*list_price) average_sales\n  from (select ss_quantity quantity\n             ,ss_list_price list_price\n       from store_sales\n           ,date_dim\n       where ss_sold_date_sk = d_date_sk\n         and d_year between 2000 and 2000 + 2\n       union all \n       select cs_quantity quantity \n             ,cs_list_price list_price\n       from catalog_sales\n           ,date_dim\n       where cs_sold_date_sk = d_date_sk\n         and d_year between 2000 and 2000 + 2 \n       union all\n       select ws_quantity quantity\n             ,ws_list_price list_price\n       from web_sales\n           ,date_dim\n       where ws_sold_date_sk = d_date_sk\n         and d_year between 2000 and 2000 + 2) x)\n  select  channel, i_brand_id,i_class_id,i_category_id,sum(sales), sum(number_sales)\n from(\n       select 'store' channel, i_brand_id,i_class_id\n             
,i_category_id,sum(ss_quantity*ss_list_price) sales\n             , count(*) number_sales\n       from store_sales\n           ,item\n           ,date_dim\n       where ss_item_sk in (select ss_item_sk from cross_items)\n         and ss_item_sk = i_item_sk\n         and ss_sold_date_sk = d_date_sk\n         and d_year = 2000+2 \n         and d_moy = 11\n       group by i_brand_id,i_class_id,i_category_id\n       having sum(ss_quantity*ss_list_price) > (select average_sales from avg_sales)\n       union all\n       select 'catalog' channel, i_brand_id,i_class_id,i_category_id, sum(cs_quantity*cs_list_price) sales, count(*) number_sales\n       from catalog_sales\n           ,item\n           ,date_dim\n       where cs_item_sk in (select ss_item_sk from cross_items)\n         and cs_item_sk = i_item_sk\n         and cs_sold_date_sk = d_date_sk\n         and d_year = 2000+2 \n         and d_moy = 11\n       group by i_brand_id,i_class_id,i_category_id\n       having sum(cs_quantity*cs_list_price) > (select average_sales from avg_sales)\n       union all\n       select 'web' channel, i_brand_id,i_class_id,i_category_id, sum(ws_quantity*ws_list_price) sales , count(*) number_sales\n       from web_sales\n           ,item\n           ,date_dim\n       where ws_item_sk in (select ss_item_sk from cross_items)\n         and ws_item_sk = i_item_sk\n         and ws_sold_date_sk = d_date_sk\n         and d_year = 2000+2\n         and d_moy = 11\n       group by i_brand_id,i_class_id,i_category_id\n       having sum(ws_quantity*ws_list_price) > (select average_sales from avg_sales)\n ) y\n group by rollup (channel, i_brand_id,i_class_id,i_category_id)\n order by channel,i_brand_id,i_class_id,i_category_id\n limit 100;\nwith  cross_items as\n (select i_item_sk ss_item_sk\n from item,\n (select iss.i_brand_id brand_id\n     ,iss.i_class_id class_id\n     ,iss.i_category_id category_id\n from store_sales\n     ,item iss\n     ,date_dim d1\n where ss_item_sk = iss.i_item_sk\n   and 
ss_sold_date_sk = d1.d_date_sk\n   and d1.d_year between 2000 AND 2000 + 2\n intersect\n select ics.i_brand_id\n     ,ics.i_class_id\n     ,ics.i_category_id\n from catalog_sales\n     ,item ics\n     ,date_dim d2\n where cs_item_sk = ics.i_item_sk\n   and cs_sold_date_sk = d2.d_date_sk\n   and d2.d_year between 2000 AND 2000 + 2\n intersect\n select iws.i_brand_id\n     ,iws.i_class_id\n     ,iws.i_category_id\n from web_sales\n     ,item iws\n     ,date_dim d3\n where ws_item_sk = iws.i_item_sk\n   and ws_sold_date_sk = d3.d_date_sk\n   and d3.d_year between 2000 AND 2000 + 2) x\n where i_brand_id = brand_id\n      and i_class_id = class_id\n      and i_category_id = category_id\n),\n avg_sales as\n(select avg(quantity*list_price) average_sales\n  from (select ss_quantity quantity\n             ,ss_list_price list_price\n       from store_sales\n           ,date_dim\n       where ss_sold_date_sk = d_date_sk\n         and d_year between 2000 and 2000 + 2\n       union all\n       select cs_quantity quantity\n             ,cs_list_price list_price\n       from catalog_sales\n           ,date_dim\n       where cs_sold_date_sk = d_date_sk\n         and d_year between 2000 and 2000 + 2\n       union all\n       select ws_quantity quantity\n             ,ws_list_price list_price\n       from web_sales\n           ,date_dim\n       where ws_sold_date_sk = d_date_sk\n         and d_year between 2000 and 2000 + 2) x)\n  select  this_year.channel ty_channel\n                           ,this_year.i_brand_id ty_brand\n                           ,this_year.i_class_id ty_class\n                           ,this_year.i_category_id ty_category\n                           ,this_year.sales ty_sales\n                           ,this_year.number_sales ty_number_sales\n                           ,last_year.channel ly_channel\n                           ,last_year.i_brand_id ly_brand\n                           ,last_year.i_class_id ly_class\n                           
,last_year.i_category_id ly_category\n                           ,last_year.sales ly_sales\n                           ,last_year.number_sales ly_number_sales \n from\n (select 'store' channel, i_brand_id,i_class_id,i_category_id\n        ,sum(ss_quantity*ss_list_price) sales, count(*) number_sales\n from store_sales \n     ,item\n     ,date_dim\n where ss_item_sk in (select ss_item_sk from cross_items)\n   and ss_item_sk = i_item_sk\n   and ss_sold_date_sk = d_date_sk\n   and d_week_seq = (select d_week_seq\n                     from date_dim\n                     where d_year = 2000 + 1\n                       and d_moy = 12\n                       and d_dom = 15)\n group by i_brand_id,i_class_id,i_category_id\n having sum(ss_quantity*ss_list_price) > (select average_sales from avg_sales)) this_year,\n (select 'store' channel, i_brand_id,i_class_id\n        ,i_category_id, sum(ss_quantity*ss_list_price) sales, count(*) number_sales\n from store_sales\n     ,item\n     ,date_dim\n where ss_item_sk in (select ss_item_sk from cross_items)\n   and ss_item_sk = i_item_sk\n   and ss_sold_date_sk = d_date_sk\n   and d_week_seq = (select d_week_seq\n                     from date_dim\n                     where d_year = 2000\n                       and d_moy = 12\n                       and d_dom = 15)\n group by i_brand_id,i_class_id,i_category_id\n having sum(ss_quantity*ss_list_price) > (select average_sales from avg_sales)) last_year\n where this_year.i_brand_id= last_year.i_brand_id\n   and this_year.i_class_id = last_year.i_class_id\n   and this_year.i_category_id = last_year.i_category_id\n order by this_year.channel, this_year.i_brand_id, this_year.i_class_id, this_year.i_category_id\n limit 100;\n\n-- end query 1 in stream 0 using template query14.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query15.sql",
    "content": "-- start query 1 in stream 0 using template query15.tpl and seed 1819994127\nselect  ca_zip\n       ,sum(cs_sales_price)\n from catalog_sales\n     ,customer\n     ,customer_address\n     ,date_dim\n where cs_bill_customer_sk = c_customer_sk\n   and c_current_addr_sk = ca_address_sk\n   and ( substr(ca_zip,1,5) in ('85669', '86197','88274','83405','86475',\n                                '85392', '85460', '80348', '81792')\n         or ca_state in ('CA','WA','GA')\n         or cs_sales_price > 500)\n   and cs_sold_date_sk = d_date_sk\n   and d_qoy = 2 and d_year = 1998\n group by ca_zip\n order by ca_zip\n limit 100;\n\n-- end query 1 in stream 0 using template query15.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query16.sql",
    "content": "-- start query 1 in stream 0 using template query16.tpl and seed 171719422\nselect  \n   count(distinct cs_order_number) as `order count`\n  ,sum(cs_ext_ship_cost) as `total shipping cost`\n  ,sum(cs_net_profit) as `total net profit`\nfrom\n   catalog_sales cs1\n  ,date_dim\n  ,customer_address\n  ,call_center\nwhere\n    d_date between '1999-04-01' and \n           (cast('1999-04-01' as date) + 60 days)\nand cs1.cs_ship_date_sk = d_date_sk\nand cs1.cs_ship_addr_sk = ca_address_sk\nand ca_state = 'IL'\nand cs1.cs_call_center_sk = cc_call_center_sk\nand cc_county in ('Richland County','Bronx County','Maverick County','Mesa County',\n                  'Raleigh County'\n)\nand exists (select *\n            from catalog_sales cs2\n            where cs1.cs_order_number = cs2.cs_order_number\n              and cs1.cs_warehouse_sk <> cs2.cs_warehouse_sk)\nand not exists(select *\n               from catalog_returns cr1\n               where cs1.cs_order_number = cr1.cr_order_number)\norder by count(distinct cs_order_number)\nlimit 100;\n\n-- end query 1 in stream 0 using template query16.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query17.sql",
    "content": "-- start query 1 in stream 0 using template query17.tpl and seed 1819994127\nselect  i_item_id\n       ,i_item_desc\n       ,s_state\n       ,count(ss_quantity) as store_sales_quantitycount\n       ,avg(ss_quantity) as store_sales_quantityave\n       ,stddev_samp(ss_quantity) as store_sales_quantitystdev\n       ,stddev_samp(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov\n       ,count(sr_return_quantity) as store_returns_quantitycount\n       ,avg(sr_return_quantity) as store_returns_quantityave\n       ,stddev_samp(sr_return_quantity) as store_returns_quantitystdev\n       ,stddev_samp(sr_return_quantity)/avg(sr_return_quantity) as store_returns_quantitycov\n       ,count(cs_quantity) as catalog_sales_quantitycount ,avg(cs_quantity) as catalog_sales_quantityave\n       ,stddev_samp(cs_quantity) as catalog_sales_quantitystdev\n       ,stddev_samp(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitycov\n from store_sales\n     ,store_returns\n     ,catalog_sales\n     ,date_dim d1\n     ,date_dim d2\n     ,date_dim d3\n     ,store\n     ,item\n where d1.d_quarter_name = '2000Q1'\n   and d1.d_date_sk = ss_sold_date_sk\n   and i_item_sk = ss_item_sk\n   and s_store_sk = ss_store_sk\n   and ss_customer_sk = sr_customer_sk\n   and ss_item_sk = sr_item_sk\n   and ss_ticket_number = sr_ticket_number\n   and sr_returned_date_sk = d2.d_date_sk\n   and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3')\n   and sr_customer_sk = cs_bill_customer_sk\n   and sr_item_sk = cs_item_sk\n   and cs_sold_date_sk = d3.d_date_sk\n   and d3.d_quarter_name in ('2000Q1','2000Q2','2000Q3')\n group by i_item_id\n         ,i_item_desc\n         ,s_state\n order by i_item_id\n         ,i_item_desc\n         ,s_state\nlimit 100;\n\n-- end query 1 in stream 0 using template query17.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query18.sql",
    "content": "-- start query 1 in stream 0 using template query18.tpl and seed 1978355063\nselect  i_item_id,\n        ca_country,\n        ca_state, \n        ca_county,\n        avg( cast(cs_quantity as decimal(12,2))) agg1,\n        avg( cast(cs_list_price as decimal(12,2))) agg2,\n        avg( cast(cs_coupon_amt as decimal(12,2))) agg3,\n        avg( cast(cs_sales_price as decimal(12,2))) agg4,\n        avg( cast(cs_net_profit as decimal(12,2))) agg5,\n        avg( cast(c_birth_year as decimal(12,2))) agg6,\n        avg( cast(cd1.cd_dep_count as decimal(12,2))) agg7\n from catalog_sales, customer_demographics cd1, \n      customer_demographics cd2, customer, customer_address, date_dim, item\n where cs_sold_date_sk = d_date_sk and\n       cs_item_sk = i_item_sk and\n       cs_bill_cdemo_sk = cd1.cd_demo_sk and\n       cs_bill_customer_sk = c_customer_sk and\n       cd1.cd_gender = 'M' and \n       cd1.cd_education_status = 'Unknown' and\n       c_current_cdemo_sk = cd2.cd_demo_sk and\n       c_current_addr_sk = ca_address_sk and\n       c_birth_month in (5,1,4,7,8,9) and\n       d_year = 2002 and\n       ca_state in ('AR','TX','NC'\n                   ,'GA','MS','WV','AL')\n group by rollup (i_item_id, ca_country, ca_state, ca_county)\n order by ca_country,\n        ca_state, \n        ca_county,\n        i_item_id\n limit 100;\n\n-- end query 1 in stream 0 using template query18.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query19.sql",
    "content": "-- start query 1 in stream 0 using template query19.tpl and seed 1930872976\nselect  i_brand_id brand_id, i_brand brand, i_manufact_id, i_manufact,\n        sum(ss_ext_sales_price) ext_price\n from date_dim, store_sales, item,customer,customer_address,store\n where d_date_sk = ss_sold_date_sk\n   and ss_item_sk = i_item_sk\n   and i_manager_id=16\n   and d_moy=12\n   and d_year=1998\n   and ss_customer_sk = c_customer_sk\n   and c_current_addr_sk = ca_address_sk\n   and substr(ca_zip,1,5) <> substr(s_zip,1,5)\n   and ss_store_sk = s_store_sk\n group by i_brand\n      ,i_brand_id\n      ,i_manufact_id\n      ,i_manufact\n order by ext_price desc\n         ,i_brand\n         ,i_brand_id\n         ,i_manufact_id\n         ,i_manufact\nlimit 100;\n\n-- end query 1 in stream 0 using template query19.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query2.sql",
    "content": "-- start query 1 in stream 0 using template query2.tpl and seed 1819994127\nwith wscs as\n (select sold_date_sk\n        ,sales_price\n  from (select ws_sold_date_sk sold_date_sk\n              ,ws_ext_sales_price sales_price\n        from web_sales) x \n        union all\n        (select cs_sold_date_sk sold_date_sk\n              ,cs_ext_sales_price sales_price\n        from catalog_sales)),\n wswscs as \n (select d_week_seq,\n        sum(case when (d_day_name='Sunday') then sales_price else null end) sun_sales,\n        sum(case when (d_day_name='Monday') then sales_price else null end) mon_sales,\n        sum(case when (d_day_name='Tuesday') then sales_price else  null end) tue_sales,\n        sum(case when (d_day_name='Wednesday') then sales_price else null end) wed_sales,\n        sum(case when (d_day_name='Thursday') then sales_price else null end) thu_sales,\n        sum(case when (d_day_name='Friday') then sales_price else null end) fri_sales,\n        sum(case when (d_day_name='Saturday') then sales_price else null end) sat_sales\n from wscs\n     ,date_dim\n where d_date_sk = sold_date_sk\n group by d_week_seq)\n select d_week_seq1\n       ,round(sun_sales1/sun_sales2,2)\n       ,round(mon_sales1/mon_sales2,2)\n       ,round(tue_sales1/tue_sales2,2)\n       ,round(wed_sales1/wed_sales2,2)\n       ,round(thu_sales1/thu_sales2,2)\n       ,round(fri_sales1/fri_sales2,2)\n       ,round(sat_sales1/sat_sales2,2)\n from\n (select wswscs.d_week_seq d_week_seq1\n        ,sun_sales sun_sales1\n        ,mon_sales mon_sales1\n        ,tue_sales tue_sales1\n        ,wed_sales wed_sales1\n        ,thu_sales thu_sales1\n        ,fri_sales fri_sales1\n        ,sat_sales sat_sales1\n  from wswscs,date_dim \n  where date_dim.d_week_seq = wswscs.d_week_seq and\n        d_year = 1998) y,\n (select wswscs.d_week_seq d_week_seq2\n        ,sun_sales sun_sales2\n        ,mon_sales mon_sales2\n        ,tue_sales tue_sales2\n        ,wed_sales wed_sales2\n        
,thu_sales thu_sales2\n        ,fri_sales fri_sales2\n        ,sat_sales sat_sales2\n  from wswscs\n      ,date_dim \n  where date_dim.d_week_seq = wswscs.d_week_seq and\n        d_year = 1998+1) z\n where d_week_seq1=d_week_seq2-53\n order by d_week_seq1;\n\n-- end query 1 in stream 0 using template query2.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query20.sql",
    "content": "-- start query 1 in stream 0 using template query20.tpl and seed 345591136\nselect  i_item_id\n       ,i_item_desc \n       ,i_category \n       ,i_class \n       ,i_current_price\n       ,sum(cs_ext_sales_price) as itemrevenue \n       ,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over\n           (partition by i_class) as revenueratio\n from catalog_sales\n     ,item \n     ,date_dim\n where cs_item_sk = i_item_sk \n   and i_category in ('Shoes', 'Electronics', 'Children')\n   and cs_sold_date_sk = d_date_sk\n   and d_date between cast('2001-03-14' as date)\n                  and (cast('2001-03-14' as date) + 30 days)\n group by i_item_id\n         ,i_item_desc \n         ,i_category\n         ,i_class\n         ,i_current_price\n order by i_category\n         ,i_class\n         ,i_item_id\n         ,i_item_desc\n         ,revenueratio\nlimit 100;\n\n-- end query 1 in stream 0 using template query20.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query21.sql",
    "content": "-- start query 1 in stream 0 using template query21.tpl and seed 1819994127\nselect  *\n from(select w_warehouse_name\n            ,i_item_id\n            ,sum(case when (cast(d_date as date) < cast ('1999-03-20' as date))\n\t                then inv_quantity_on_hand \n                      else 0 end) as inv_before\n            ,sum(case when (cast(d_date as date) >= cast ('1999-03-20' as date))\n                      then inv_quantity_on_hand \n                      else 0 end) as inv_after\n   from inventory\n       ,warehouse\n       ,item\n       ,date_dim\n   where i_current_price between 0.99 and 1.49\n     and i_item_sk          = inv_item_sk\n     and inv_warehouse_sk   = w_warehouse_sk\n     and inv_date_sk    = d_date_sk\n     and d_date between (cast ('1999-03-20' as date) - 30 days)\n                    and (cast ('1999-03-20' as date) + 30 days)\n   group by w_warehouse_name, i_item_id) x\n where (case when inv_before > 0 \n             then inv_after / inv_before \n             else null\n             end) between 2.0/3.0 and 3.0/2.0\n order by w_warehouse_name\n         ,i_item_id\n limit 100;\n\n-- end query 1 in stream 0 using template query21.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query22.sql",
    "content": "-- start query 1 in stream 0 using template query22.tpl and seed 1819994127\nselect  i_product_name\n             ,i_brand\n             ,i_class\n             ,i_category\n             ,avg(inv_quantity_on_hand) qoh\n       from inventory\n           ,date_dim\n           ,item\n       where inv_date_sk=d_date_sk\n              and inv_item_sk=i_item_sk\n              and d_month_seq between 1186 and 1186 + 11\n       group by rollup(i_product_name\n                       ,i_brand\n                       ,i_class\n                       ,i_category)\norder by qoh, i_product_name, i_brand, i_class, i_category\nlimit 100;\n\n-- end query 1 in stream 0 using template query22.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query23.sql",
    "content": "-- start query 1 in stream 0 using template query23.tpl and seed 2031708268\nwith frequent_ss_items as \n (select substr(i_item_desc,1,30) itemdesc,i_item_sk item_sk,d_date solddate,count(*) cnt\n  from store_sales\n      ,date_dim \n      ,item\n  where ss_sold_date_sk = d_date_sk\n    and ss_item_sk = i_item_sk \n    and d_year in (2000,2000+1,2000+2,2000+3)\n  group by substr(i_item_desc,1,30),i_item_sk,d_date\n  having count(*) >4),\n max_store_sales as\n (select max(csales) tpcds_cmax \n  from (select c_customer_sk,sum(ss_quantity*ss_sales_price) csales\n        from store_sales\n            ,customer\n            ,date_dim \n        where ss_customer_sk = c_customer_sk\n         and ss_sold_date_sk = d_date_sk\n         and d_year in (2000,2000+1,2000+2,2000+3) \n        group by c_customer_sk) x),\n best_ss_customer as\n (select c_customer_sk,sum(ss_quantity*ss_sales_price) ssales\n  from store_sales\n      ,customer\n  where ss_customer_sk = c_customer_sk\n  group by c_customer_sk\n  having sum(ss_quantity*ss_sales_price) > (95/100.0) * (select\n  *\nfrom\n max_store_sales))\n  select  sum(sales)\n from (select cs_quantity*cs_list_price sales\n       from catalog_sales\n           ,date_dim \n       where d_year = 2000 \n         and d_moy = 3 \n         and cs_sold_date_sk = d_date_sk \n         and cs_item_sk in (select item_sk from frequent_ss_items)\n         and cs_bill_customer_sk in (select c_customer_sk from best_ss_customer)\n      union all\n      select ws_quantity*ws_list_price sales\n       from web_sales \n           ,date_dim \n       where d_year = 2000 \n         and d_moy = 3 \n         and ws_sold_date_sk = d_date_sk \n         and ws_item_sk in (select item_sk from frequent_ss_items)\n         and ws_bill_customer_sk in (select c_customer_sk from best_ss_customer)) y \n limit 100;\nwith frequent_ss_items as\n (select substr(i_item_desc,1,30) itemdesc,i_item_sk item_sk,d_date solddate,count(*) cnt\n  from store_sales\n     
 ,date_dim\n      ,item\n  where ss_sold_date_sk = d_date_sk\n    and ss_item_sk = i_item_sk\n    and d_year in (2000,2000 + 1,2000 + 2,2000 + 3)\n  group by substr(i_item_desc,1,30),i_item_sk,d_date\n  having count(*) >4),\n max_store_sales as\n (select max(csales) tpcds_cmax\n  from (select c_customer_sk,sum(ss_quantity*ss_sales_price) csales\n        from store_sales\n            ,customer\n            ,date_dim \n        where ss_customer_sk = c_customer_sk\n         and ss_sold_date_sk = d_date_sk\n         and d_year in (2000,2000+1,2000+2,2000+3)\n        group by c_customer_sk) x),\n best_ss_customer as\n (select c_customer_sk,sum(ss_quantity*ss_sales_price) ssales\n  from store_sales\n      ,customer\n  where ss_customer_sk = c_customer_sk\n  group by c_customer_sk\n  having sum(ss_quantity*ss_sales_price) > (95/100.0) * (select\n  *\n from max_store_sales))\n  select  c_last_name,c_first_name,sales\n from (select c_last_name,c_first_name,sum(cs_quantity*cs_list_price) sales\n        from catalog_sales\n            ,customer\n            ,date_dim \n        where d_year = 2000 \n         and d_moy = 3 \n         and cs_sold_date_sk = d_date_sk \n         and cs_item_sk in (select item_sk from frequent_ss_items)\n         and cs_bill_customer_sk in (select c_customer_sk from best_ss_customer)\n         and cs_bill_customer_sk = c_customer_sk \n       group by c_last_name,c_first_name\n      union all\n      select c_last_name,c_first_name,sum(ws_quantity*ws_list_price) sales\n       from web_sales\n           ,customer\n           ,date_dim \n       where d_year = 2000 \n         and d_moy = 3 \n         and ws_sold_date_sk = d_date_sk \n         and ws_item_sk in (select item_sk from frequent_ss_items)\n         and ws_bill_customer_sk in (select c_customer_sk from best_ss_customer)\n         and ws_bill_customer_sk = c_customer_sk\n       group by c_last_name,c_first_name) y\n     order by c_last_name,c_first_name,sales\n  limit 100;\n\n-- end query 1 in 
stream 0 using template query23.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query24.sql",
    "content": "-- start query 1 in stream 0 using template query24.tpl and seed 1220860970\nwith ssales as\n(select c_last_name\n      ,c_first_name\n      ,s_store_name\n      ,ca_state\n      ,s_state\n      ,i_color\n      ,i_current_price\n      ,i_manager_id\n      ,i_units\n      ,i_size\n      ,sum(ss_sales_price) netpaid\nfrom store_sales\n    ,store_returns\n    ,store\n    ,item\n    ,customer\n    ,customer_address\nwhere ss_ticket_number = sr_ticket_number\n  and ss_item_sk = sr_item_sk\n  and ss_customer_sk = c_customer_sk\n  and ss_item_sk = i_item_sk\n  and ss_store_sk = s_store_sk\n  and c_current_addr_sk = ca_address_sk\n  and c_birth_country <> upper(ca_country)\n  and s_zip = ca_zip\nand s_market_id=10\ngroup by c_last_name\n        ,c_first_name\n        ,s_store_name\n        ,ca_state\n        ,s_state\n        ,i_color\n        ,i_current_price\n        ,i_manager_id\n        ,i_units\n        ,i_size)\nselect c_last_name\n      ,c_first_name\n      ,s_store_name\n      ,sum(netpaid) paid\nfrom ssales\nwhere i_color = 'snow'\ngroup by c_last_name\n        ,c_first_name\n        ,s_store_name\nhaving sum(netpaid) > (select 0.05*avg(netpaid)\n                                 from ssales)\norder by c_last_name\n        ,c_first_name\n        ,s_store_name\n;\nwith ssales as\n(select c_last_name\n      ,c_first_name\n      ,s_store_name\n      ,ca_state\n      ,s_state\n      ,i_color\n      ,i_current_price\n      ,i_manager_id\n      ,i_units\n      ,i_size\n      ,sum(ss_sales_price) netpaid\nfrom store_sales\n    ,store_returns\n    ,store\n    ,item\n    ,customer\n    ,customer_address\nwhere ss_ticket_number = sr_ticket_number\n  and ss_item_sk = sr_item_sk\n  and ss_customer_sk = c_customer_sk\n  and ss_item_sk = i_item_sk\n  and ss_store_sk = s_store_sk\n  and c_current_addr_sk = ca_address_sk\n  and c_birth_country <> upper(ca_country)\n  and s_zip = ca_zip\n  and s_market_id = 10\ngroup by c_last_name\n        ,c_first_name\n        
,s_store_name\n        ,ca_state\n        ,s_state\n        ,i_color\n        ,i_current_price\n        ,i_manager_id\n        ,i_units\n        ,i_size)\nselect c_last_name\n      ,c_first_name\n      ,s_store_name\n      ,sum(netpaid) paid\nfrom ssales\nwhere i_color = 'chiffon'\ngroup by c_last_name\n        ,c_first_name\n        ,s_store_name\nhaving sum(netpaid) > (select 0.05*avg(netpaid)\n                           from ssales)\norder by c_last_name\n        ,c_first_name\n        ,s_store_name\n;\n\n-- end query 1 in stream 0 using template query24.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query25.sql",
    "content": "-- start query 1 in stream 0 using template query25.tpl and seed 1819994127\nselect  \n i_item_id\n ,i_item_desc\n ,s_store_id\n ,s_store_name\n ,sum(ss_net_profit) as store_sales_profit\n ,sum(sr_net_loss) as store_returns_loss\n ,sum(cs_net_profit) as catalog_sales_profit\n from\n store_sales\n ,store_returns\n ,catalog_sales\n ,date_dim d1\n ,date_dim d2\n ,date_dim d3\n ,store\n ,item\n where\n d1.d_moy = 4\n and d1.d_year = 2000\n and d1.d_date_sk = ss_sold_date_sk\n and i_item_sk = ss_item_sk\n and s_store_sk = ss_store_sk\n and ss_customer_sk = sr_customer_sk\n and ss_item_sk = sr_item_sk\n and ss_ticket_number = sr_ticket_number\n and sr_returned_date_sk = d2.d_date_sk\n and d2.d_moy               between 4 and  10\n and d2.d_year              = 2000\n and sr_customer_sk = cs_bill_customer_sk\n and sr_item_sk = cs_item_sk\n and cs_sold_date_sk = d3.d_date_sk\n and d3.d_moy               between 4 and  10 \n and d3.d_year              = 2000\n group by\n i_item_id\n ,i_item_desc\n ,s_store_id\n ,s_store_name\n order by\n i_item_id\n ,i_item_desc\n ,s_store_id\n ,s_store_name\n limit 100;\n\n-- end query 1 in stream 0 using template query25.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query26.sql",
    "content": "-- start query 1 in stream 0 using template query26.tpl and seed 1930872976\nselect  i_item_id, \n        avg(cs_quantity) agg1,\n        avg(cs_list_price) agg2,\n        avg(cs_coupon_amt) agg3,\n        avg(cs_sales_price) agg4 \n from catalog_sales, customer_demographics, date_dim, item, promotion\n where cs_sold_date_sk = d_date_sk and\n       cs_item_sk = i_item_sk and\n       cs_bill_cdemo_sk = cd_demo_sk and\n       cs_promo_sk = p_promo_sk and\n       cd_gender = 'F' and \n       cd_marital_status = 'S' and\n       cd_education_status = 'College' and\n       (p_channel_email = 'N' or p_channel_event = 'N') and\n       d_year = 1998 \n group by i_item_id\n order by i_item_id\n limit 100;\n\n-- end query 1 in stream 0 using template query26.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query27.sql",
    "content": "-- start query 1 in stream 0 using template query27.tpl and seed 2017787633\nselect  i_item_id,\n        s_state, grouping(s_state) g_state,\n        avg(ss_quantity) agg1,\n        avg(ss_list_price) agg2,\n        avg(ss_coupon_amt) agg3,\n        avg(ss_sales_price) agg4\n from store_sales, customer_demographics, date_dim, store, item\n where ss_sold_date_sk = d_date_sk and\n       ss_item_sk = i_item_sk and\n       ss_store_sk = s_store_sk and\n       ss_cdemo_sk = cd_demo_sk and\n       cd_gender = 'F' and\n       cd_marital_status = 'U' and\n       cd_education_status = '2 yr Degree' and\n       d_year = 2000 and\n       s_state in ('AL','IN', 'SC', 'NY', 'OH', 'FL')\n group by rollup (i_item_id, s_state)\n order by i_item_id\n         ,s_state\n limit 100;\n\n-- end query 1 in stream 0 using template query27.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query28.sql",
    "content": "-- start query 1 in stream 0 using template query28.tpl and seed 444293455\nselect  *\nfrom (select avg(ss_list_price) B1_LP\n            ,count(ss_list_price) B1_CNT\n            ,count(distinct ss_list_price) B1_CNTD\n      from store_sales\n      where ss_quantity between 0 and 5\n        and (ss_list_price between 73 and 73+10 \n             or ss_coupon_amt between 7826 and 7826+1000\n             or ss_wholesale_cost between 70 and 70+20)) B1,\n     (select avg(ss_list_price) B2_LP\n            ,count(ss_list_price) B2_CNT\n            ,count(distinct ss_list_price) B2_CNTD\n      from store_sales\n      where ss_quantity between 6 and 10\n        and (ss_list_price between 152 and 152+10\n          or ss_coupon_amt between 2196 and 2196+1000\n          or ss_wholesale_cost between 56 and 56+20)) B2,\n     (select avg(ss_list_price) B3_LP\n            ,count(ss_list_price) B3_CNT\n            ,count(distinct ss_list_price) B3_CNTD\n      from store_sales\n      where ss_quantity between 11 and 15\n        and (ss_list_price between 53 and 53+10\n          or ss_coupon_amt between 3430 and 3430+1000\n          or ss_wholesale_cost between 13 and 13+20)) B3,\n     (select avg(ss_list_price) B4_LP\n            ,count(ss_list_price) B4_CNT\n            ,count(distinct ss_list_price) B4_CNTD\n      from store_sales\n      where ss_quantity between 16 and 20\n        and (ss_list_price between 182 and 182+10\n          or ss_coupon_amt between 3262 and 3262+1000\n          or ss_wholesale_cost between 20 and 20+20)) B4,\n     (select avg(ss_list_price) B5_LP\n            ,count(ss_list_price) B5_CNT\n            ,count(distinct ss_list_price) B5_CNTD\n      from store_sales\n      where ss_quantity between 21 and 25\n        and (ss_list_price between 85 and 85+10\n          or ss_coupon_amt between 3310 and 3310+1000\n          or ss_wholesale_cost between 37 and 37+20)) B5,\n     (select avg(ss_list_price) B6_LP\n            ,count(ss_list_price) 
B6_CNT\n            ,count(distinct ss_list_price) B6_CNTD\n      from store_sales\n      where ss_quantity between 26 and 30\n        and (ss_list_price between 180 and 180+10\n          or ss_coupon_amt between 12592 and 12592+1000\n          or ss_wholesale_cost between 22 and 22+20)) B6\nlimit 100;\n\n-- end query 1 in stream 0 using template query28.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query29.sql",
    "content": "-- start query 1 in stream 0 using template query29.tpl and seed 2031708268\nselect   \n     i_item_id\n    ,i_item_desc\n    ,s_store_id\n    ,s_store_name\n    ,stddev_samp(ss_quantity)        as store_sales_quantity\n    ,stddev_samp(sr_return_quantity) as store_returns_quantity\n    ,stddev_samp(cs_quantity)        as catalog_sales_quantity\n from\n    store_sales\n   ,store_returns\n   ,catalog_sales\n   ,date_dim             d1\n   ,date_dim             d2\n   ,date_dim             d3\n   ,store\n   ,item\n where\n     d1.d_moy               = 4 \n and d1.d_year              = 1998\n and d1.d_date_sk           = ss_sold_date_sk\n and i_item_sk              = ss_item_sk\n and s_store_sk             = ss_store_sk\n and ss_customer_sk         = sr_customer_sk\n and ss_item_sk             = sr_item_sk\n and ss_ticket_number       = sr_ticket_number\n and sr_returned_date_sk    = d2.d_date_sk\n and d2.d_moy               between 4 and  4 + 3 \n and d2.d_year              = 1998\n and sr_customer_sk         = cs_bill_customer_sk\n and sr_item_sk             = cs_item_sk\n and cs_sold_date_sk        = d3.d_date_sk     \n and d3.d_year              in (1998,1998+1,1998+2)\n group by\n    i_item_id\n   ,i_item_desc\n   ,s_store_id\n   ,s_store_name\n order by\n    i_item_id \n   ,i_item_desc\n   ,s_store_id\n   ,s_store_name\n limit 100;\n\n-- end query 1 in stream 0 using template query29.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query3.sql",
    "content": "-- start query 1 in stream 0 using template query3.tpl and seed 2031708268\nselect  dt.d_year \n       ,item.i_brand_id brand_id \n       ,item.i_brand brand\n       ,sum(ss_sales_price) sum_agg\n from  date_dim dt \n      ,store_sales\n      ,item\n where dt.d_date_sk = store_sales.ss_sold_date_sk\n   and store_sales.ss_item_sk = item.i_item_sk\n   and item.i_manufact_id = 816\n   and dt.d_moy=11\n group by dt.d_year\n      ,item.i_brand\n      ,item.i_brand_id\n order by dt.d_year\n         ,sum_agg desc\n         ,brand_id\n limit 100;\n\n-- end query 1 in stream 0 using template query3.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query30.sql",
    "content": "-- start query 1 in stream 0 using template query30.tpl and seed 1819994127\nwith customer_total_return as\n (select wr_returning_customer_sk as ctr_customer_sk\n        ,ca_state as ctr_state, \n \tsum(wr_return_amt) as ctr_total_return\n from web_returns\n     ,date_dim\n     ,customer_address\n where wr_returned_date_sk = d_date_sk \n   and d_year =2000\n   and wr_returning_addr_sk = ca_address_sk \n group by wr_returning_customer_sk\n         ,ca_state)\n  select  c_customer_id,c_salutation,c_first_name,c_last_name,c_preferred_cust_flag\n       ,c_birth_day,c_birth_month,c_birth_year,c_birth_country,c_login,c_email_address\n       ,c_last_review_date_sk,ctr_total_return\n from customer_total_return ctr1\n     ,customer_address\n     ,customer\n where ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2\n \t\t\t  from customer_total_return ctr2 \n                  \t  where ctr1.ctr_state = ctr2.ctr_state)\n       and ca_address_sk = c_current_addr_sk\n       and ca_state = 'GA'\n       and ctr1.ctr_customer_sk = c_customer_sk\n order by c_customer_id,c_salutation,c_first_name,c_last_name,c_preferred_cust_flag\n                  ,c_birth_day,c_birth_month,c_birth_year,c_birth_country,c_login,c_email_address\n                  ,c_last_review_date_sk,ctr_total_return\nlimit 100;\n\n-- end query 1 in stream 0 using template query30.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query31.sql",
    "content": "-- start query 1 in stream 0 using template query31.tpl and seed 1819994127\nwith ss as\n (select ca_county,d_qoy, d_year,sum(ss_ext_sales_price) as store_sales\n from store_sales,date_dim,customer_address\n where ss_sold_date_sk = d_date_sk\n  and ss_addr_sk=ca_address_sk\n group by ca_county,d_qoy, d_year),\n ws as\n (select ca_county,d_qoy, d_year,sum(ws_ext_sales_price) as web_sales\n from web_sales,date_dim,customer_address\n where ws_sold_date_sk = d_date_sk\n  and ws_bill_addr_sk=ca_address_sk\n group by ca_county,d_qoy, d_year)\n select \n        ss1.ca_county\n       ,ss1.d_year\n       ,ws2.web_sales/ws1.web_sales web_q1_q2_increase\n       ,ss2.store_sales/ss1.store_sales store_q1_q2_increase\n       ,ws3.web_sales/ws2.web_sales web_q2_q3_increase\n       ,ss3.store_sales/ss2.store_sales store_q2_q3_increase\n from\n        ss ss1\n       ,ss ss2\n       ,ss ss3\n       ,ws ws1\n       ,ws ws2\n       ,ws ws3\n where\n    ss1.d_qoy = 1\n    and ss1.d_year = 1999\n    and ss1.ca_county = ss2.ca_county\n    and ss2.d_qoy = 2\n    and ss2.d_year = 1999\n and ss2.ca_county = ss3.ca_county\n    and ss3.d_qoy = 3\n    and ss3.d_year = 1999\n    and ss1.ca_county = ws1.ca_county\n    and ws1.d_qoy = 1\n    and ws1.d_year = 1999\n    and ws1.ca_county = ws2.ca_county\n    and ws2.d_qoy = 2\n    and ws2.d_year = 1999\n    and ws1.ca_county = ws3.ca_county\n    and ws3.d_qoy = 3\n    and ws3.d_year =1999\n    and case when ws1.web_sales > 0 then ws2.web_sales/ws1.web_sales else null end \n       > case when ss1.store_sales > 0 then ss2.store_sales/ss1.store_sales else null end\n    and case when ws2.web_sales > 0 then ws3.web_sales/ws2.web_sales else null end\n       > case when ss2.store_sales > 0 then ss3.store_sales/ss2.store_sales else null end\n order by ss1.d_year;\n\n-- end query 1 in stream 0 using template query31.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query32.sql",
    "content": "-- start query 1 in stream 0 using template query32.tpl and seed 2031708268\nselect  sum(cs_ext_discount_amt)  as `excess discount amount` \nfrom \n   catalog_sales \n   ,item \n   ,date_dim\nwhere\ni_manufact_id = 66\nand i_item_sk = cs_item_sk \nand d_date between '2002-03-29' and \n        (cast('2002-03-29' as date) + 90 days)\nand d_date_sk = cs_sold_date_sk \nand cs_ext_discount_amt  \n     > ( \n         select \n            1.3 * avg(cs_ext_discount_amt) \n         from \n            catalog_sales \n           ,date_dim\n         where \n              cs_item_sk = i_item_sk \n          and d_date between '2002-03-29' and\n                             (cast('2002-03-29' as date) + 90 days)\n          and d_date_sk = cs_sold_date_sk \n      ) \nlimit 100;\n\n-- end query 1 in stream 0 using template query32.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query33.sql",
    "content": "-- start query 1 in stream 0 using template query33.tpl and seed 1930872976\nwith ss as (\n select\n          i_manufact_id,sum(ss_ext_sales_price) total_sales\n from\n \tstore_sales,\n \tdate_dim,\n         customer_address,\n         item\n where\n         i_manufact_id in (select\n  i_manufact_id\nfrom\n item\nwhere i_category in ('Home'))\n and     ss_item_sk              = i_item_sk\n and     ss_sold_date_sk         = d_date_sk\n and     d_year                  = 1998\n and     d_moy                   = 5\n and     ss_addr_sk              = ca_address_sk\n and     ca_gmt_offset           = -6 \n group by i_manufact_id),\n cs as (\n select\n          i_manufact_id,sum(cs_ext_sales_price) total_sales\n from\n \tcatalog_sales,\n \tdate_dim,\n         customer_address,\n         item\n where\n         i_manufact_id               in (select\n  i_manufact_id\nfrom\n item\nwhere i_category in ('Home'))\n and     cs_item_sk              = i_item_sk\n and     cs_sold_date_sk         = d_date_sk\n and     d_year                  = 1998\n and     d_moy                   = 5\n and     cs_bill_addr_sk         = ca_address_sk\n and     ca_gmt_offset           = -6 \n group by i_manufact_id),\n ws as (\n select\n          i_manufact_id,sum(ws_ext_sales_price) total_sales\n from\n \tweb_sales,\n \tdate_dim,\n         customer_address,\n         item\n where\n         i_manufact_id               in (select\n  i_manufact_id\nfrom\n item\nwhere i_category in ('Home'))\n and     ws_item_sk              = i_item_sk\n and     ws_sold_date_sk         = d_date_sk\n and     d_year                  = 1998\n and     d_moy                   = 5\n and     ws_bill_addr_sk         = ca_address_sk\n and     ca_gmt_offset           = -6\n group by i_manufact_id)\n  select  i_manufact_id ,sum(total_sales) total_sales\n from  (select * from ss \n        union all\n        select * from cs \n        union all\n        select * from ws) tmp1\n group by i_manufact_id\n order by 
total_sales\nlimit 100;\n\n-- end query 1 in stream 0 using template query33.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query34.sql",
    "content": "-- start query 1 in stream 0 using template query34.tpl and seed 1971067816\nselect c_last_name\n       ,c_first_name\n       ,c_salutation\n       ,c_preferred_cust_flag\n       ,ss_ticket_number\n       ,cnt from\n   (select ss_ticket_number\n          ,ss_customer_sk\n          ,count(*) cnt\n    from store_sales,date_dim,store,household_demographics\n    where store_sales.ss_sold_date_sk = date_dim.d_date_sk\n    and store_sales.ss_store_sk = store.s_store_sk  \n    and store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk\n    and (date_dim.d_dom between 1 and 3 or date_dim.d_dom between 25 and 28)\n    and (household_demographics.hd_buy_potential = '>10000' or\n         household_demographics.hd_buy_potential = 'Unknown')\n    and household_demographics.hd_vehicle_count > 0\n    and (case when household_demographics.hd_vehicle_count > 0 \n\tthen household_demographics.hd_dep_count/ household_demographics.hd_vehicle_count \n\telse null \n\tend)  > 1.2\n    and date_dim.d_year in (2000,2000+1,2000+2)\n    and store.s_county in ('Salem County','Terrell County','Arthur County','Oglethorpe County',\n                           'Lunenburg County','Perry County','Halifax County','Sumner County')\n    group by ss_ticket_number,ss_customer_sk) dn,customer\n    where ss_customer_sk = c_customer_sk\n      and cnt between 15 and 20\n    order by c_last_name,c_first_name,c_salutation,c_preferred_cust_flag desc, ss_ticket_number;\n\n-- end query 1 in stream 0 using template query34.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query35.sql",
    "content": "-- start query 1 in stream 0 using template query35.tpl and seed 1930872976\nselect   \n  ca_state,\n  cd_gender,\n  cd_marital_status,\n  cd_dep_count,\n  count(*) cnt1,\n  avg(cd_dep_count),\n  min(cd_dep_count),\n  stddev_samp(cd_dep_count),\n  cd_dep_employed_count,\n  count(*) cnt2,\n  avg(cd_dep_employed_count),\n  min(cd_dep_employed_count),\n  stddev_samp(cd_dep_employed_count),\n  cd_dep_college_count,\n  count(*) cnt3,\n  avg(cd_dep_college_count),\n  min(cd_dep_college_count),\n  stddev_samp(cd_dep_college_count)\n from\n  customer c,customer_address ca,customer_demographics\n where\n  c.c_current_addr_sk = ca.ca_address_sk and\n  cd_demo_sk = c.c_current_cdemo_sk and \n  exists (select *\n          from store_sales,date_dim\n          where c.c_customer_sk = ss_customer_sk and\n                ss_sold_date_sk = d_date_sk and\n                d_year = 2001 and\n                d_qoy < 4) and\n   (exists (select *\n            from web_sales,date_dim\n            where c.c_customer_sk = ws_bill_customer_sk and\n                  ws_sold_date_sk = d_date_sk and\n                  d_year = 2001 and\n                  d_qoy < 4) or \n    exists (select * \n            from catalog_sales,date_dim\n            where c.c_customer_sk = cs_ship_customer_sk and\n                  cs_sold_date_sk = d_date_sk and\n                  d_year = 2001 and\n                  d_qoy < 4))\n group by ca_state,\n          cd_gender,\n          cd_marital_status,\n          cd_dep_count,\n          cd_dep_employed_count,\n          cd_dep_college_count\n order by ca_state,\n          cd_gender,\n          cd_marital_status,\n          cd_dep_count,\n          cd_dep_employed_count,\n          cd_dep_college_count\n limit 100;\n\n-- end query 1 in stream 0 using template query35.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query36.sql",
    "content": "-- start query 1 in stream 0 using template query36.tpl and seed 1544728811\nselect  \n    sum(ss_net_profit)/sum(ss_ext_sales_price) as gross_margin\n   ,i_category\n   ,i_class\n   ,grouping(i_category)+grouping(i_class) as lochierarchy\n   ,rank() over (\n \tpartition by grouping(i_category)+grouping(i_class),\n \tcase when grouping(i_class) = 0 then i_category end \n \torder by sum(ss_net_profit)/sum(ss_ext_sales_price) asc) as rank_within_parent\n from\n    store_sales\n   ,date_dim       d1\n   ,item\n   ,store\n where\n    d1.d_year = 1999 \n and d1.d_date_sk = ss_sold_date_sk\n and i_item_sk  = ss_item_sk \n and s_store_sk  = ss_store_sk\n and s_state in ('IN','AL','MI','MN',\n                 'TN','LA','FL','NM')\n group by rollup(i_category,i_class)\n order by\n   lochierarchy desc\n  ,case when lochierarchy = 0 then i_category end\n  ,rank_within_parent\n  limit 100;\n\n-- end query 1 in stream 0 using template query36.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query37.sql",
    "content": "-- start query 1 in stream 0 using template query37.tpl and seed 301843662\nselect  i_item_id\n       ,i_item_desc\n       ,i_current_price\n from item, inventory, date_dim, catalog_sales\n where i_current_price between 39 and 39 + 30\n and inv_item_sk = i_item_sk\n and d_date_sk=inv_date_sk\n and d_date between cast('2001-01-16' as date) and (cast('2001-01-16' as date) +  60 days)\n and i_manufact_id in (765,886,889,728)\n and inv_quantity_on_hand between 100 and 500\n and cs_item_sk = i_item_sk\n group by i_item_id,i_item_desc,i_current_price\n order by i_item_id\n limit 100;\n\n-- end query 1 in stream 0 using template query37.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query38.sql",
    "content": "-- start query 1 in stream 0 using template query38.tpl and seed 1819994127\nselect  count(*) from (\n    select distinct c_last_name, c_first_name, d_date\n    from store_sales, date_dim, customer\n          where store_sales.ss_sold_date_sk = date_dim.d_date_sk\n      and store_sales.ss_customer_sk = customer.c_customer_sk\n      and d_month_seq between 1186 and 1186 + 11\n  intersect\n    select distinct c_last_name, c_first_name, d_date\n    from catalog_sales, date_dim, customer\n          where catalog_sales.cs_sold_date_sk = date_dim.d_date_sk\n      and catalog_sales.cs_bill_customer_sk = customer.c_customer_sk\n      and d_month_seq between 1186 and 1186 + 11\n  intersect\n    select distinct c_last_name, c_first_name, d_date\n    from web_sales, date_dim, customer\n          where web_sales.ws_sold_date_sk = date_dim.d_date_sk\n      and web_sales.ws_bill_customer_sk = customer.c_customer_sk\n      and d_month_seq between 1186 and 1186 + 11\n) hot_cust\nlimit 100;\n\n-- end query 1 in stream 0 using template query38.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query39.sql",
    "content": "-- start query 1 in stream 0 using template query39.tpl and seed 1327317894\nwith inv as\n(select w_warehouse_name,w_warehouse_sk,i_item_sk,d_moy\n       ,stdev,mean, case mean when 0 then null else stdev/mean end cov\n from(select w_warehouse_name,w_warehouse_sk,i_item_sk,d_moy\n            ,stddev_samp(inv_quantity_on_hand) stdev,avg(inv_quantity_on_hand) mean\n      from inventory\n          ,item\n          ,warehouse\n          ,date_dim\n      where inv_item_sk = i_item_sk\n        and inv_warehouse_sk = w_warehouse_sk\n        and inv_date_sk = d_date_sk\n        and d_year =2000\n      group by w_warehouse_name,w_warehouse_sk,i_item_sk,d_moy) foo\n where case mean when 0 then 0 else stdev/mean end > 1)\nselect inv1.w_warehouse_sk,inv1.i_item_sk,inv1.d_moy,inv1.mean, inv1.cov\n        ,inv2.w_warehouse_sk,inv2.i_item_sk,inv2.d_moy,inv2.mean, inv2.cov\nfrom inv inv1,inv inv2\nwhere inv1.i_item_sk = inv2.i_item_sk\n  and inv1.w_warehouse_sk =  inv2.w_warehouse_sk\n  and inv1.d_moy=2\n  and inv2.d_moy=2+1\norder by inv1.w_warehouse_sk,inv1.i_item_sk,inv1.d_moy,inv1.mean,inv1.cov\n        ,inv2.d_moy,inv2.mean, inv2.cov\n;\nwith inv as\n(select w_warehouse_name,w_warehouse_sk,i_item_sk,d_moy\n       ,stdev,mean, case mean when 0 then null else stdev/mean end cov\n from(select w_warehouse_name,w_warehouse_sk,i_item_sk,d_moy\n            ,stddev_samp(inv_quantity_on_hand) stdev,avg(inv_quantity_on_hand) mean\n      from inventory\n          ,item\n          ,warehouse\n          ,date_dim\n      where inv_item_sk = i_item_sk\n        and inv_warehouse_sk = w_warehouse_sk\n        and inv_date_sk = d_date_sk\n        and d_year =2000\n      group by w_warehouse_name,w_warehouse_sk,i_item_sk,d_moy) foo\n where case mean when 0 then 0 else stdev/mean end > 1)\nselect inv1.w_warehouse_sk,inv1.i_item_sk,inv1.d_moy,inv1.mean, inv1.cov\n        ,inv2.w_warehouse_sk,inv2.i_item_sk,inv2.d_moy,inv2.mean, inv2.cov\nfrom inv inv1,inv inv2\nwhere inv1.i_item_sk 
= inv2.i_item_sk\n  and inv1.w_warehouse_sk =  inv2.w_warehouse_sk\n  and inv1.d_moy=2\n  and inv2.d_moy=2+1\n  and inv1.cov > 1.5\norder by inv1.w_warehouse_sk,inv1.i_item_sk,inv1.d_moy,inv1.mean,inv1.cov\n        ,inv2.d_moy,inv2.mean, inv2.cov\n;\n\n-- end query 1 in stream 0 using template query39.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query4.sql",
    "content": "-- start query 1 in stream 0 using template query4.tpl and seed 1819994127\nwith year_total as (\n select c_customer_id customer_id\n       ,c_first_name customer_first_name\n       ,c_last_name customer_last_name\n       ,c_preferred_cust_flag customer_preferred_cust_flag\n       ,c_birth_country customer_birth_country\n       ,c_login customer_login\n       ,c_email_address customer_email_address\n       ,d_year dyear\n       ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2) year_total\n       ,'s' sale_type\n from customer\n     ,store_sales\n     ,date_dim\n where c_customer_sk = ss_customer_sk\n   and ss_sold_date_sk = d_date_sk\n group by c_customer_id\n         ,c_first_name\n         ,c_last_name\n         ,c_preferred_cust_flag\n         ,c_birth_country\n         ,c_login\n         ,c_email_address\n         ,d_year\n union all\n select c_customer_id customer_id\n       ,c_first_name customer_first_name\n       ,c_last_name customer_last_name\n       ,c_preferred_cust_flag customer_preferred_cust_flag\n       ,c_birth_country customer_birth_country\n       ,c_login customer_login\n       ,c_email_address customer_email_address\n       ,d_year dyear\n       ,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2) ) year_total\n       ,'c' sale_type\n from customer\n     ,catalog_sales\n     ,date_dim\n where c_customer_sk = cs_bill_customer_sk\n   and cs_sold_date_sk = d_date_sk\n group by c_customer_id\n         ,c_first_name\n         ,c_last_name\n         ,c_preferred_cust_flag\n         ,c_birth_country\n         ,c_login\n         ,c_email_address\n         ,d_year\nunion all\n select c_customer_id customer_id\n       ,c_first_name customer_first_name\n       ,c_last_name customer_last_name\n       ,c_preferred_cust_flag customer_preferred_cust_flag\n       ,c_birth_country customer_birth_country\n       ,c_login customer_login\n       ,c_email_address 
customer_email_address\n       ,d_year dyear\n       ,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2) ) year_total\n       ,'w' sale_type\n from customer\n     ,web_sales\n     ,date_dim\n where c_customer_sk = ws_bill_customer_sk\n   and ws_sold_date_sk = d_date_sk\n group by c_customer_id\n         ,c_first_name\n         ,c_last_name\n         ,c_preferred_cust_flag\n         ,c_birth_country\n         ,c_login\n         ,c_email_address\n         ,d_year\n         )\n  select  \n                  t_s_secyear.customer_id\n                 ,t_s_secyear.customer_first_name\n                 ,t_s_secyear.customer_last_name\n                 ,t_s_secyear.customer_birth_country\n from year_total t_s_firstyear\n     ,year_total t_s_secyear\n     ,year_total t_c_firstyear\n     ,year_total t_c_secyear\n     ,year_total t_w_firstyear\n     ,year_total t_w_secyear\n where t_s_secyear.customer_id = t_s_firstyear.customer_id\n   and t_s_firstyear.customer_id = t_c_secyear.customer_id\n   and t_s_firstyear.customer_id = t_c_firstyear.customer_id\n   and t_s_firstyear.customer_id = t_w_firstyear.customer_id\n   and t_s_firstyear.customer_id = t_w_secyear.customer_id\n   and t_s_firstyear.sale_type = 's'\n   and t_c_firstyear.sale_type = 'c'\n   and t_w_firstyear.sale_type = 'w'\n   and t_s_secyear.sale_type = 's'\n   and t_c_secyear.sale_type = 'c'\n   and t_w_secyear.sale_type = 'w'\n   and t_s_firstyear.dyear =  1999\n   and t_s_secyear.dyear = 1999+1\n   and t_c_firstyear.dyear =  1999\n   and t_c_secyear.dyear =  1999+1\n   and t_w_firstyear.dyear = 1999\n   and t_w_secyear.dyear = 1999+1\n   and t_s_firstyear.year_total > 0\n   and t_c_firstyear.year_total > 0\n   and t_w_firstyear.year_total > 0\n   and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end\n           > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else null 
end\n   and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end\n           > case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else null end\n order by t_s_secyear.customer_id\n         ,t_s_secyear.customer_first_name\n         ,t_s_secyear.customer_last_name\n         ,t_s_secyear.customer_birth_country\nlimit 100;\n\n-- end query 1 in stream 0 using template query4.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query40.sql",
    "content": "-- start query 1 in stream 0 using template query40.tpl and seed 1819994127\nselect  \n   w_state\n  ,i_item_id\n  ,sum(case when (cast(d_date as date) < cast ('2000-03-18' as date)) \n \t\tthen cs_sales_price - coalesce(cr_refunded_cash,0) else 0 end) as sales_before\n  ,sum(case when (cast(d_date as date) >= cast ('2000-03-18' as date)) \n \t\tthen cs_sales_price - coalesce(cr_refunded_cash,0) else 0 end) as sales_after\n from\n   catalog_sales left outer join catalog_returns on\n       (cs_order_number = cr_order_number \n        and cs_item_sk = cr_item_sk)\n  ,warehouse \n  ,item\n  ,date_dim\n where\n     i_current_price between 0.99 and 1.49\n and i_item_sk          = cs_item_sk\n and cs_warehouse_sk    = w_warehouse_sk \n and cs_sold_date_sk    = d_date_sk\n and d_date between (cast ('2000-03-18' as date) - 30 days)\n                and (cast ('2000-03-18' as date) + 30 days) \n group by\n    w_state,i_item_id\n order by w_state,i_item_id\nlimit 100;\n\n-- end query 1 in stream 0 using template query40.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query41.sql",
    "content": "-- start query 1 in stream 0 using template query41.tpl and seed 1581015815\nselect  distinct(i_product_name)\n from item i1\n where i_manufact_id between 970 and 970+40 \n   and (select count(*) as item_cnt\n        from item\n        where (i_manufact = i1.i_manufact and\n        ((i_category = 'Women' and \n        (i_color = 'frosted' or i_color = 'rose') and \n        (i_units = 'Lb' or i_units = 'Gross') and\n        (i_size = 'medium' or i_size = 'large')\n        ) or\n        (i_category = 'Women' and\n        (i_color = 'chocolate' or i_color = 'black') and\n        (i_units = 'Box' or i_units = 'Dram') and\n        (i_size = 'economy' or i_size = 'petite')\n        ) or\n        (i_category = 'Men' and\n        (i_color = 'slate' or i_color = 'magenta') and\n        (i_units = 'Carton' or i_units = 'Bundle') and\n        (i_size = 'N/A' or i_size = 'small')\n        ) or\n        (i_category = 'Men' and\n        (i_color = 'cornflower' or i_color = 'firebrick') and\n        (i_units = 'Pound' or i_units = 'Oz') and\n        (i_size = 'medium' or i_size = 'large')\n        ))) or\n       (i_manufact = i1.i_manufact and\n        ((i_category = 'Women' and \n        (i_color = 'almond' or i_color = 'steel') and \n        (i_units = 'Tsp' or i_units = 'Case') and\n        (i_size = 'medium' or i_size = 'large')\n        ) or\n        (i_category = 'Women' and\n        (i_color = 'purple' or i_color = 'aquamarine') and\n        (i_units = 'Bunch' or i_units = 'Gram') and\n        (i_size = 'economy' or i_size = 'petite')\n        ) or\n        (i_category = 'Men' and\n        (i_color = 'lavender' or i_color = 'papaya') and\n        (i_units = 'Pallet' or i_units = 'Cup') and\n        (i_size = 'N/A' or i_size = 'small')\n        ) or\n        (i_category = 'Men' and\n        (i_color = 'maroon' or i_color = 'cyan') and\n        (i_units = 'Each' or i_units = 'N/A') and\n        (i_size = 'medium' or i_size = 'large')\n        )))) > 0\n order 
by i_product_name\n limit 100;\n\n-- end query 1 in stream 0 using template query41.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query42.sql",
    "content": "-- start query 1 in stream 0 using template query42.tpl and seed 1819994127\nselect  dt.d_year\n \t,item.i_category_id\n \t,item.i_category\n \t,sum(ss_ext_sales_price)\n from \tdate_dim dt\n \t,store_sales\n \t,item\n where dt.d_date_sk = store_sales.ss_sold_date_sk\n \tand store_sales.ss_item_sk = item.i_item_sk\n \tand item.i_manager_id = 1  \t\n \tand dt.d_moy=12\n \tand dt.d_year=1998\n group by \tdt.d_year\n \t\t,item.i_category_id\n \t\t,item.i_category\n order by       sum(ss_ext_sales_price) desc,dt.d_year\n \t\t,item.i_category_id\n \t\t,item.i_category\nlimit 100 ;\n\n-- end query 1 in stream 0 using template query42.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query43.sql",
    "content": "-- start query 1 in stream 0 using template query43.tpl and seed 1819994127\nselect  s_store_name, s_store_id,\n        sum(case when (d_day_name='Sunday') then ss_sales_price else null end) sun_sales,\n        sum(case when (d_day_name='Monday') then ss_sales_price else null end) mon_sales,\n        sum(case when (d_day_name='Tuesday') then ss_sales_price else  null end) tue_sales,\n        sum(case when (d_day_name='Wednesday') then ss_sales_price else null end) wed_sales,\n        sum(case when (d_day_name='Thursday') then ss_sales_price else null end) thu_sales,\n        sum(case when (d_day_name='Friday') then ss_sales_price else null end) fri_sales,\n        sum(case when (d_day_name='Saturday') then ss_sales_price else null end) sat_sales\n from date_dim, store_sales, store\n where d_date_sk = ss_sold_date_sk and\n       s_store_sk = ss_store_sk and\n       s_gmt_offset = -6 and\n       d_year = 2001 \n group by s_store_name, s_store_id\n order by s_store_name, s_store_id,sun_sales,mon_sales,tue_sales,wed_sales,thu_sales,fri_sales,sat_sales\n limit 100;\n\n-- end query 1 in stream 0 using template query43.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query44.sql",
    "content": "-- start query 1 in stream 0 using template query44.tpl and seed 1819994127\nselect  asceding.rnk, i1.i_product_name best_performing, i2.i_product_name worst_performing\nfrom(select *\n     from (select item_sk,rank() over (order by rank_col asc) rnk\n           from (select ss_item_sk item_sk,avg(ss_net_profit) rank_col \n                 from store_sales ss1\n                 where ss_store_sk = 366\n                 group by ss_item_sk\n                 having avg(ss_net_profit) > 0.9*(select avg(ss_net_profit) rank_col\n                                                  from store_sales\n                                                  where ss_store_sk = 366\n                                                    and ss_cdemo_sk is null\n                                                  group by ss_store_sk))V1)V11\n     where rnk  < 11) asceding,\n    (select *\n     from (select item_sk,rank() over (order by rank_col desc) rnk\n           from (select ss_item_sk item_sk,avg(ss_net_profit) rank_col\n                 from store_sales ss1\n                 where ss_store_sk = 366\n                 group by ss_item_sk\n                 having avg(ss_net_profit) > 0.9*(select avg(ss_net_profit) rank_col\n                                                  from store_sales\n                                                  where ss_store_sk = 366\n                                                    and ss_cdemo_sk is null\n                                                  group by ss_store_sk))V2)V21\n     where rnk  < 11) descending,\nitem i1,\nitem i2\nwhere asceding.rnk = descending.rnk \n  and i1.i_item_sk=asceding.item_sk\n  and i2.i_item_sk=descending.item_sk\norder by asceding.rnk\nlimit 100;\n\n-- end query 1 in stream 0 using template query44.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query45.sql",
    "content": "-- start query 1 in stream 0 using template query45.tpl and seed 2031708268\nselect  ca_zip, ca_county, sum(ws_sales_price)\n from web_sales, customer, customer_address, date_dim, item\n where ws_bill_customer_sk = c_customer_sk\n \tand c_current_addr_sk = ca_address_sk \n \tand ws_item_sk = i_item_sk \n \tand ( substr(ca_zip,1,5) in ('85669', '86197','88274','83405','86475', '85392', '85460', '80348', '81792')\n \t      or \n \t      i_item_id in (select i_item_id\n                             from item\n                             where i_item_sk in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)\n                             )\n \t    )\n \tand ws_sold_date_sk = d_date_sk\n \tand d_qoy = 1 and d_year = 1998\n group by ca_zip, ca_county\n order by ca_zip, ca_county\n limit 100;\n\n-- end query 1 in stream 0 using template query45.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query46.sql",
    "content": "-- start query 1 in stream 0 using template query46.tpl and seed 803547492\nselect  c_last_name\n       ,c_first_name\n       ,ca_city\n       ,bought_city\n       ,ss_ticket_number\n       ,amt,profit \n from\n   (select ss_ticket_number\n          ,ss_customer_sk\n          ,ca_city bought_city\n          ,sum(ss_coupon_amt) amt\n          ,sum(ss_net_profit) profit\n    from store_sales,date_dim,store,household_demographics,customer_address \n    where store_sales.ss_sold_date_sk = date_dim.d_date_sk\n    and store_sales.ss_store_sk = store.s_store_sk  \n    and store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk\n    and store_sales.ss_addr_sk = customer_address.ca_address_sk\n    and (household_demographics.hd_dep_count = 0 or\n         household_demographics.hd_vehicle_count= 1)\n    and date_dim.d_dow in (6,0)\n    and date_dim.d_year in (2000,2000+1,2000+2) \n    and store.s_city in ('Five Forks','Oakland','Fairview','Winchester','Farmington') \n    group by ss_ticket_number,ss_customer_sk,ss_addr_sk,ca_city) dn,customer,customer_address current_addr\n    where ss_customer_sk = c_customer_sk\n      and customer.c_current_addr_sk = current_addr.ca_address_sk\n      and current_addr.ca_city <> bought_city\n  order by c_last_name\n          ,c_first_name\n          ,ca_city\n          ,bought_city\n          ,ss_ticket_number\n  limit 100;\n\n-- end query 1 in stream 0 using template query46.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query47.sql",
    "content": "-- start query 1 in stream 0 using template query47.tpl and seed 2031708268\nwith v1 as(\n select i_category, i_brand,\n        s_store_name, s_company_name,\n        d_year, d_moy,\n        sum(ss_sales_price) sum_sales,\n        avg(sum(ss_sales_price)) over\n          (partition by i_category, i_brand,\n                     s_store_name, s_company_name, d_year)\n          avg_monthly_sales,\n        rank() over\n          (partition by i_category, i_brand,\n                     s_store_name, s_company_name\n           order by d_year, d_moy) rn\n from item, store_sales, date_dim, store\n where ss_item_sk = i_item_sk and\n       ss_sold_date_sk = d_date_sk and\n       ss_store_sk = s_store_sk and\n       (\n         d_year = 1999 or\n         ( d_year = 1999-1 and d_moy =12) or\n         ( d_year = 1999+1 and d_moy =1)\n       )\n group by i_category, i_brand,\n          s_store_name, s_company_name,\n          d_year, d_moy),\n v2 as(\n select v1.s_store_name\n        ,v1.d_year, v1.d_moy\n        ,v1.avg_monthly_sales\n        ,v1.sum_sales, v1_lag.sum_sales psum, v1_lead.sum_sales nsum\n from v1, v1 v1_lag, v1 v1_lead\n where v1.i_category = v1_lag.i_category and\n       v1.i_category = v1_lead.i_category and\n       v1.i_brand = v1_lag.i_brand and\n       v1.i_brand = v1_lead.i_brand and\n       v1.s_store_name = v1_lag.s_store_name and\n       v1.s_store_name = v1_lead.s_store_name and\n       v1.s_company_name = v1_lag.s_company_name and\n       v1.s_company_name = v1_lead.s_company_name and\n       v1.rn = v1_lag.rn + 1 and\n       v1.rn = v1_lead.rn - 1)\n  select  *\n from v2\n where  d_year = 1999 and    \n        avg_monthly_sales > 0 and\n        case when avg_monthly_sales > 0 then abs(sum_sales - avg_monthly_sales) / avg_monthly_sales else null end > 0.1\n order by sum_sales - avg_monthly_sales, sum_sales\n limit 100;\n\n-- end query 1 in stream 0 using template query47.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query48.sql",
    "content": "-- start query 1 in stream 0 using template query48.tpl and seed 622697896\nselect sum (ss_quantity)\n from store_sales, store, customer_demographics, customer_address, date_dim\n where s_store_sk = ss_store_sk\n and  ss_sold_date_sk = d_date_sk and d_year = 1998\n and  \n (\n  (\n   cd_demo_sk = ss_cdemo_sk\n   and \n   cd_marital_status = 'M'\n   and \n   cd_education_status = 'Unknown'\n   and \n   ss_sales_price between 100.00 and 150.00  \n   )\n or\n  (\n  cd_demo_sk = ss_cdemo_sk\n   and \n   cd_marital_status = 'W'\n   and \n   cd_education_status = 'College'\n   and \n   ss_sales_price between 50.00 and 100.00   \n  )\n or \n (\n  cd_demo_sk = ss_cdemo_sk\n  and \n   cd_marital_status = 'D'\n   and \n   cd_education_status = 'Primary'\n   and \n   ss_sales_price between 150.00 and 200.00  \n )\n )\n and\n (\n  (\n  ss_addr_sk = ca_address_sk\n  and\n  ca_country = 'United States'\n  and\n  ca_state in ('MI', 'GA', 'NH')\n  and ss_net_profit between 0 and 2000  \n  )\n or\n  (ss_addr_sk = ca_address_sk\n  and\n  ca_country = 'United States'\n  and\n  ca_state in ('TX', 'KY', 'SD')\n  and ss_net_profit between 150 and 3000 \n  )\n or\n  (ss_addr_sk = ca_address_sk\n  and\n  ca_country = 'United States'\n  and\n  ca_state in ('NY', 'OH', 'FL')\n  and ss_net_profit between 50 and 25000 \n  )\n )\n;\n\n-- end query 1 in stream 0 using template query48.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query49.sql",
    "content": "-- start query 1 in stream 0 using template query49.tpl and seed 1819994127\nselect  channel, item, return_ratio, return_rank, currency_rank from\n (select\n 'web' as channel\n ,web.item as item\n ,web.return_ratio as return_ratio\n ,web.return_rank as return_rank\n ,web.currency_rank as currency_rank\n from (\n \tselect \n \t item\n \t,return_ratio\n \t,currency_ratio\n \t,rank() over (order by return_ratio) as return_rank\n \t,rank() over (order by currency_ratio) as currency_rank\n \tfrom\n \t(\tselect ws.ws_item_sk as item\n \t\t,(cast(sum(coalesce(wr.wr_return_quantity,0)) as decimal(15,4))/\n \t\tcast(sum(coalesce(ws.ws_quantity,0)) as decimal(15,4) )) as return_ratio\n \t\t,(cast(sum(coalesce(wr.wr_return_amt,0)) as decimal(15,4))/\n \t\tcast(sum(coalesce(ws.ws_net_paid,0)) as decimal(15,4) )) as currency_ratio\n \t\tfrom \n \t\t web_sales ws left outer join web_returns wr \n \t\t\ton (ws.ws_order_number = wr.wr_order_number and \n \t\t\tws.ws_item_sk = wr.wr_item_sk)\n                 ,date_dim\n \t\twhere \n \t\t\twr.wr_return_amt > 10000 \n \t\t\tand ws.ws_net_profit > 1\n                         and ws.ws_net_paid > 0\n                         and ws.ws_quantity > 0\n                         and ws_sold_date_sk = d_date_sk\n                         and d_year = 2000\n                         and d_moy = 12\n \t\tgroup by ws.ws_item_sk\n \t) in_web\n ) web\n where \n (\n web.return_rank <= 10\n or\n web.currency_rank <= 10\n )\n union\n select \n 'catalog' as channel\n ,catalog.item as item\n ,catalog.return_ratio as return_ratio\n ,catalog.return_rank as return_rank\n ,catalog.currency_rank as currency_rank\n from (\n \tselect \n \t item\n \t,return_ratio\n \t,currency_ratio\n \t,rank() over (order by return_ratio) as return_rank\n \t,rank() over (order by currency_ratio) as currency_rank\n \tfrom\n \t(\tselect \n \t\tcs.cs_item_sk as item\n \t\t,(cast(sum(coalesce(cr.cr_return_quantity,0)) as decimal(15,4))/\n 
\t\tcast(sum(coalesce(cs.cs_quantity,0)) as decimal(15,4) )) as return_ratio\n \t\t,(cast(sum(coalesce(cr.cr_return_amount,0)) as decimal(15,4))/\n \t\tcast(sum(coalesce(cs.cs_net_paid,0)) as decimal(15,4) )) as currency_ratio\n \t\tfrom \n \t\tcatalog_sales cs left outer join catalog_returns cr\n \t\t\ton (cs.cs_order_number = cr.cr_order_number and \n \t\t\tcs.cs_item_sk = cr.cr_item_sk)\n                ,date_dim\n \t\twhere \n \t\t\tcr.cr_return_amount > 10000 \n \t\t\tand cs.cs_net_profit > 1\n                         and cs.cs_net_paid > 0\n                         and cs.cs_quantity > 0\n                         and cs_sold_date_sk = d_date_sk\n                         and d_year = 2000\n                         and d_moy = 12\n                 group by cs.cs_item_sk\n \t) in_cat\n ) catalog\n where \n (\n catalog.return_rank <= 10\n or\n catalog.currency_rank <=10\n )\n union\n select \n 'store' as channel\n ,store.item as item\n ,store.return_ratio as return_ratio\n ,store.return_rank as return_rank\n ,store.currency_rank as currency_rank\n from (\n \tselect \n \t item\n \t,return_ratio\n \t,currency_ratio\n \t,rank() over (order by return_ratio) as return_rank\n \t,rank() over (order by currency_ratio) as currency_rank\n \tfrom\n \t(\tselect sts.ss_item_sk as item\n \t\t,(cast(sum(coalesce(sr.sr_return_quantity,0)) as decimal(15,4))/cast(sum(coalesce(sts.ss_quantity,0)) as decimal(15,4) )) as return_ratio\n \t\t,(cast(sum(coalesce(sr.sr_return_amt,0)) as decimal(15,4))/cast(sum(coalesce(sts.ss_net_paid,0)) as decimal(15,4) )) as currency_ratio\n \t\tfrom \n \t\tstore_sales sts left outer join store_returns sr\n \t\t\ton (sts.ss_ticket_number = sr.sr_ticket_number and sts.ss_item_sk = sr.sr_item_sk)\n                ,date_dim\n \t\twhere \n \t\t\tsr.sr_return_amt > 10000 \n \t\t\tand sts.ss_net_profit > 1\n                         and sts.ss_net_paid > 0 \n                         and sts.ss_quantity > 0\n                         and ss_sold_date_sk = 
d_date_sk\n                         and d_year = 2000\n                         and d_moy = 12\n \t\tgroup by sts.ss_item_sk\n \t) in_store\n ) store\n where  (\n store.return_rank <= 10\n or \n store.currency_rank <= 10\n )\n ) y\n order by 1,4,5,2\n limit 100;\n\n-- end query 1 in stream 0 using template query49.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query5.sql",
    "content": "-- start query 1 in stream 0 using template query5.tpl and seed 1819994127\nwith ssr as\n (select s_store_id,\n        sum(sales_price) as sales,\n        sum(profit) as profit,\n        sum(return_amt) as returns,\n        sum(net_loss) as profit_loss\n from\n  ( select  ss_store_sk as store_sk,\n            ss_sold_date_sk  as date_sk,\n            ss_ext_sales_price as sales_price,\n            ss_net_profit as profit,\n            cast(0 as decimal(7,2)) as return_amt,\n            cast(0 as decimal(7,2)) as net_loss\n    from store_sales\n    union all\n    select sr_store_sk as store_sk,\n           sr_returned_date_sk as date_sk,\n           cast(0 as decimal(7,2)) as sales_price,\n           cast(0 as decimal(7,2)) as profit,\n           sr_return_amt as return_amt,\n           sr_net_loss as net_loss\n    from store_returns\n   ) salesreturns,\n     date_dim,\n     store\n where date_sk = d_date_sk\n       and d_date between cast('2000-08-19' as date) \n                  and (cast('2000-08-19' as date) +  14 days)\n       and store_sk = s_store_sk\n group by s_store_id)\n ,\n csr as\n (select cp_catalog_page_id,\n        sum(sales_price) as sales,\n        sum(profit) as profit,\n        sum(return_amt) as returns,\n        sum(net_loss) as profit_loss\n from\n  ( select  cs_catalog_page_sk as page_sk,\n            cs_sold_date_sk  as date_sk,\n            cs_ext_sales_price as sales_price,\n            cs_net_profit as profit,\n            cast(0 as decimal(7,2)) as return_amt,\n            cast(0 as decimal(7,2)) as net_loss\n    from catalog_sales\n    union all\n    select cr_catalog_page_sk as page_sk,\n           cr_returned_date_sk as date_sk,\n           cast(0 as decimal(7,2)) as sales_price,\n           cast(0 as decimal(7,2)) as profit,\n           cr_return_amount as return_amt,\n           cr_net_loss as net_loss\n    from catalog_returns\n   ) salesreturns,\n     date_dim,\n     catalog_page\n where date_sk = d_date_sk\n       
and d_date between cast('2000-08-19' as date)\n                  and (cast('2000-08-19' as date) +  14 days)\n       and page_sk = cp_catalog_page_sk\n group by cp_catalog_page_id)\n ,\n wsr as\n (select web_site_id,\n        sum(sales_price) as sales,\n        sum(profit) as profit,\n        sum(return_amt) as returns,\n        sum(net_loss) as profit_loss\n from\n  ( select  ws_web_site_sk as wsr_web_site_sk,\n            ws_sold_date_sk  as date_sk,\n            ws_ext_sales_price as sales_price,\n            ws_net_profit as profit,\n            cast(0 as decimal(7,2)) as return_amt,\n            cast(0 as decimal(7,2)) as net_loss\n    from web_sales\n    union all\n    select ws_web_site_sk as wsr_web_site_sk,\n           wr_returned_date_sk as date_sk,\n           cast(0 as decimal(7,2)) as sales_price,\n           cast(0 as decimal(7,2)) as profit,\n           wr_return_amt as return_amt,\n           wr_net_loss as net_loss\n    from web_returns left outer join web_sales on\n         ( wr_item_sk = ws_item_sk\n           and wr_order_number = ws_order_number)\n   ) salesreturns,\n     date_dim,\n     web_site\n where date_sk = d_date_sk\n       and d_date between cast('2000-08-19' as date)\n                  and (cast('2000-08-19' as date) +  14 days)\n       and wsr_web_site_sk = web_site_sk\n group by web_site_id)\n  select  channel\n        , id\n        , sum(sales) as sales\n        , sum(returns) as returns\n        , sum(profit) as profit\n from \n (select 'store channel' as channel\n        , 'store' || s_store_id as id\n        , sales\n        , returns\n        , (profit - profit_loss) as profit\n from   ssr\n union all\n select 'catalog channel' as channel\n        , 'catalog_page' || cp_catalog_page_id as id\n        , sales\n        , returns\n        , (profit - profit_loss) as profit\n from  csr\n union all\n select 'web channel' as channel\n        , 'web_site' || web_site_id as id\n        , sales\n        , returns\n        , (profit - 
profit_loss) as profit\n from   wsr\n ) x\n group by rollup (channel, id)\n order by channel\n         ,id\n limit 100;\n\n-- end query 1 in stream 0 using template query5.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query50.sql",
    "content": "-- start query 1 in stream 0 using template query50.tpl and seed 1819994127\nselect  \n   s_store_name\n  ,s_company_id\n  ,s_street_number\n  ,s_street_name\n  ,s_street_type\n  ,s_suite_number\n  ,s_city\n  ,s_county\n  ,s_state\n  ,s_zip\n  ,sum(case when (sr_returned_date_sk - ss_sold_date_sk <= 30 ) then 1 else 0 end)  as `30 days` \n  ,sum(case when (sr_returned_date_sk - ss_sold_date_sk > 30) and \n                 (sr_returned_date_sk - ss_sold_date_sk <= 60) then 1 else 0 end )  as `31-60 days` \n  ,sum(case when (sr_returned_date_sk - ss_sold_date_sk > 60) and \n                 (sr_returned_date_sk - ss_sold_date_sk <= 90) then 1 else 0 end)  as `61-90 days` \n  ,sum(case when (sr_returned_date_sk - ss_sold_date_sk > 90) and\n                 (sr_returned_date_sk - ss_sold_date_sk <= 120) then 1 else 0 end)  as `91-120 days` \n  ,sum(case when (sr_returned_date_sk - ss_sold_date_sk  > 120) then 1 else 0 end)  as `>120 days` \nfrom\n   store_sales\n  ,store_returns\n  ,store\n  ,date_dim d1\n  ,date_dim d2\nwhere\n    d2.d_year = 1998\nand d2.d_moy  = 9\nand ss_ticket_number = sr_ticket_number\nand ss_item_sk = sr_item_sk\nand ss_sold_date_sk   = d1.d_date_sk\nand sr_returned_date_sk   = d2.d_date_sk\nand ss_customer_sk = sr_customer_sk\nand ss_store_sk = s_store_sk\ngroup by\n   s_store_name\n  ,s_company_id\n  ,s_street_number\n  ,s_street_name\n  ,s_street_type\n  ,s_suite_number\n  ,s_city\n  ,s_county\n  ,s_state\n  ,s_zip\norder by s_store_name\n        ,s_company_id\n        ,s_street_number\n        ,s_street_name\n        ,s_street_type\n        ,s_suite_number\n        ,s_city\n        ,s_county\n        ,s_state\n        ,s_zip\nlimit 100;\n\n-- end query 1 in stream 0 using template query50.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query51.sql",
    "content": "-- start query 1 in stream 0 using template query51.tpl and seed 1819994127\nWITH web_v1 as (\nselect\n  ws_item_sk item_sk, d_date,\n  sum(sum(ws_sales_price))\n      over (partition by ws_item_sk order by d_date rows between unbounded preceding and current row) cume_sales\nfrom web_sales\n    ,date_dim\nwhere ws_sold_date_sk=d_date_sk\n  and d_month_seq between 1214 and 1214+11\n  and ws_item_sk is not NULL\ngroup by ws_item_sk, d_date),\nstore_v1 as (\nselect\n  ss_item_sk item_sk, d_date,\n  sum(sum(ss_sales_price))\n      over (partition by ss_item_sk order by d_date rows between unbounded preceding and current row) cume_sales\nfrom store_sales\n    ,date_dim\nwhere ss_sold_date_sk=d_date_sk\n  and d_month_seq between 1214 and 1214+11\n  and ss_item_sk is not NULL\ngroup by ss_item_sk, d_date)\n select  *\nfrom (select item_sk\n     ,d_date\n     ,web_sales\n     ,store_sales\n     ,max(web_sales)\n         over (partition by item_sk order by d_date rows between unbounded preceding and current row) web_cumulative\n     ,max(store_sales)\n         over (partition by item_sk order by d_date rows between unbounded preceding and current row) store_cumulative\n     from (select case when web.item_sk is not null then web.item_sk else store.item_sk end item_sk\n                 ,case when web.d_date is not null then web.d_date else store.d_date end d_date\n                 ,web.cume_sales web_sales\n                 ,store.cume_sales store_sales\n           from web_v1 web full outer join store_v1 store on (web.item_sk = store.item_sk\n                                                          and web.d_date = store.d_date)\n          )x )y\nwhere web_cumulative > store_cumulative\norder by item_sk\n        ,d_date\nlimit 100;\n\n-- end query 1 in stream 0 using template query51.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query52.sql",
    "content": "-- start query 1 in stream 0 using template query52.tpl and seed 1819994127\nselect  dt.d_year\n \t,item.i_brand_id brand_id\n \t,item.i_brand brand\n \t,sum(ss_ext_sales_price) ext_price\n from date_dim dt\n     ,store_sales\n     ,item\n where dt.d_date_sk = store_sales.ss_sold_date_sk\n    and store_sales.ss_item_sk = item.i_item_sk\n    and item.i_manager_id = 1\n    and dt.d_moy=12\n    and dt.d_year=2000\n group by dt.d_year\n \t,item.i_brand\n \t,item.i_brand_id\n order by dt.d_year\n \t,ext_price desc\n \t,brand_id\nlimit 100 ;\n\n-- end query 1 in stream 0 using template query52.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query53.sql",
    "content": "-- start query 1 in stream 0 using template query53.tpl and seed 1819994127\nselect  * from \n(select i_manufact_id,\nsum(ss_sales_price) sum_sales,\navg(sum(ss_sales_price)) over (partition by i_manufact_id) avg_quarterly_sales\nfrom item, store_sales, date_dim, store\nwhere ss_item_sk = i_item_sk and\nss_sold_date_sk = d_date_sk and\nss_store_sk = s_store_sk and\nd_month_seq in (1212,1212+1,1212+2,1212+3,1212+4,1212+5,1212+6,1212+7,1212+8,1212+9,1212+10,1212+11) and\n((i_category in ('Books','Children','Electronics') and\ni_class in ('personal','portable','reference','self-help') and\ni_brand in ('scholaramalgamalg #14','scholaramalgamalg #7',\n\t\t'exportiunivamalg #9','scholaramalgamalg #9'))\nor(i_category in ('Women','Music','Men') and\ni_class in ('accessories','classical','fragrances','pants') and\ni_brand in ('amalgimporto #1','edu packscholar #1','exportiimporto #1',\n\t\t'importoamalg #1')))\ngroup by i_manufact_id, d_qoy ) tmp1\nwhere case when avg_quarterly_sales > 0 \n\tthen abs (sum_sales - avg_quarterly_sales)/ avg_quarterly_sales \n\telse null end > 0.1\norder by avg_quarterly_sales,\n\t sum_sales,\n\t i_manufact_id\nlimit 100;\n\n-- end query 1 in stream 0 using template query53.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query54.sql",
    "content": "-- start query 1 in stream 0 using template query54.tpl and seed 1930872976\nwith my_customers as (\n select distinct c_customer_sk\n        , c_current_addr_sk\n from   \n        ( select cs_sold_date_sk sold_date_sk,\n                 cs_bill_customer_sk customer_sk,\n                 cs_item_sk item_sk\n          from   catalog_sales\n          union all\n          select ws_sold_date_sk sold_date_sk,\n                 ws_bill_customer_sk customer_sk,\n                 ws_item_sk item_sk\n          from   web_sales\n         ) cs_or_ws_sales,\n         item,\n         date_dim,\n         customer\n where   sold_date_sk = d_date_sk\n         and item_sk = i_item_sk\n         and i_category = 'Books'\n         and i_class = 'business'\n         and c_customer_sk = cs_or_ws_sales.customer_sk\n         and d_moy = 2\n         and d_year = 2000\n )\n , my_revenue as (\n select c_customer_sk,\n        sum(ss_ext_sales_price) as revenue\n from   my_customers,\n        store_sales,\n        customer_address,\n        store,\n        date_dim\n where  c_current_addr_sk = ca_address_sk\n        and ca_county = s_county\n        and ca_state = s_state\n        and ss_sold_date_sk = d_date_sk\n        and c_customer_sk = ss_customer_sk\n        and d_month_seq between (select distinct d_month_seq+1\n                                 from   date_dim where d_year = 2000 and d_moy = 2)\n                           and  (select distinct d_month_seq+3\n                                 from   date_dim where d_year = 2000 and d_moy = 2)\n group by c_customer_sk\n )\n , segments as\n (select cast((revenue/50) as int) as segment\n  from   my_revenue\n )\n  select  segment, count(*) as num_customers, segment*50 as segment_base\n from segments\n group by segment\n order by segment, num_customers\n limit 100;\n\n-- end query 1 in stream 0 using template query54.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query55.sql",
    "content": "-- start query 1 in stream 0 using template query55.tpl and seed 2031708268\nselect  i_brand_id brand_id, i_brand brand,\n \tsum(ss_ext_sales_price) ext_price\n from date_dim, store_sales, item\n where d_date_sk = ss_sold_date_sk\n \tand ss_item_sk = i_item_sk\n \tand i_manager_id=13\n \tand d_moy=11\n \tand d_year=1999\n group by i_brand, i_brand_id\n order by ext_price desc, i_brand_id\nlimit 100 ;\n\n-- end query 1 in stream 0 using template query55.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query56.sql",
    "content": "-- start query 1 in stream 0 using template query56.tpl and seed 1951559352\nwith ss as (\n select i_item_id,sum(ss_ext_sales_price) total_sales\n from\n \tstore_sales,\n \tdate_dim,\n         customer_address,\n         item\n where i_item_id in (select\n     i_item_id\nfrom item\nwhere i_color in ('chiffon','smoke','lace'))\n and     ss_item_sk              = i_item_sk\n and     ss_sold_date_sk         = d_date_sk\n and     d_year                  = 2001\n and     d_moy                   = 5\n and     ss_addr_sk              = ca_address_sk\n and     ca_gmt_offset           = -6 \n group by i_item_id),\n cs as (\n select i_item_id,sum(cs_ext_sales_price) total_sales\n from\n \tcatalog_sales,\n \tdate_dim,\n         customer_address,\n         item\n where\n         i_item_id               in (select\n  i_item_id\nfrom item\nwhere i_color in ('chiffon','smoke','lace'))\n and     cs_item_sk              = i_item_sk\n and     cs_sold_date_sk         = d_date_sk\n and     d_year                  = 2001\n and     d_moy                   = 5\n and     cs_bill_addr_sk         = ca_address_sk\n and     ca_gmt_offset           = -6 \n group by i_item_id),\n ws as (\n select i_item_id,sum(ws_ext_sales_price) total_sales\n from\n \tweb_sales,\n \tdate_dim,\n         customer_address,\n         item\n where\n         i_item_id               in (select\n  i_item_id\nfrom item\nwhere i_color in ('chiffon','smoke','lace'))\n and     ws_item_sk              = i_item_sk\n and     ws_sold_date_sk         = d_date_sk\n and     d_year                  = 2001\n and     d_moy                   = 5\n and     ws_bill_addr_sk         = ca_address_sk\n and     ca_gmt_offset           = -6\n group by i_item_id)\n  select  i_item_id ,sum(total_sales) total_sales\n from  (select * from ss \n        union all\n        select * from cs \n        union all\n        select * from ws) tmp1\n group by i_item_id\n order by total_sales,\n          i_item_id\n limit 100;\n\n-- end 
query 1 in stream 0 using template query56.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query57.sql",
    "content": "-- start query 1 in stream 0 using template query57.tpl and seed 2031708268\nwith v1 as(\n select i_category, i_brand,\n        cc_name,\n        d_year, d_moy,\n        sum(cs_sales_price) sum_sales,\n        avg(sum(cs_sales_price)) over\n          (partition by i_category, i_brand,\n                     cc_name, d_year)\n          avg_monthly_sales,\n        rank() over\n          (partition by i_category, i_brand,\n                     cc_name\n           order by d_year, d_moy) rn\n from item, catalog_sales, date_dim, call_center\n where cs_item_sk = i_item_sk and\n       cs_sold_date_sk = d_date_sk and\n       cc_call_center_sk= cs_call_center_sk and\n       (\n         d_year = 1999 or\n         ( d_year = 1999-1 and d_moy =12) or\n         ( d_year = 1999+1 and d_moy =1)\n       )\n group by i_category, i_brand,\n          cc_name , d_year, d_moy),\n v2 as(\n select v1.i_category, v1.i_brand\n        ,v1.d_year, v1.d_moy\n        ,v1.avg_monthly_sales\n        ,v1.sum_sales, v1_lag.sum_sales psum, v1_lead.sum_sales nsum\n from v1, v1 v1_lag, v1 v1_lead\n where v1.i_category = v1_lag.i_category and\n       v1.i_category = v1_lead.i_category and\n       v1.i_brand = v1_lag.i_brand and\n       v1.i_brand = v1_lead.i_brand and\n       v1. cc_name = v1_lag. cc_name and\n       v1. cc_name = v1_lead. cc_name and\n       v1.rn = v1_lag.rn + 1 and\n       v1.rn = v1_lead.rn - 1)\n  select  *\n from v2\n where  d_year = 1999 and\n        avg_monthly_sales > 0 and\n        case when avg_monthly_sales > 0 then abs(sum_sales - avg_monthly_sales) / avg_monthly_sales else null end > 0.1\n order by sum_sales - avg_monthly_sales, avg_monthly_sales\n limit 100;\n\n-- end query 1 in stream 0 using template query57.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query58.sql",
    "content": "-- start query 1 in stream 0 using template query58.tpl and seed 1819994127\nwith ss_items as\n (select i_item_id item_id\n        ,sum(ss_ext_sales_price) ss_item_rev \n from store_sales\n     ,item\n     ,date_dim\n where ss_item_sk = i_item_sk\n   and d_date in (select d_date\n                  from date_dim\n                  where d_week_seq = (select d_week_seq \n                                      from date_dim\n                                      where d_date = '1998-02-21'))\n   and ss_sold_date_sk   = d_date_sk\n group by i_item_id),\n cs_items as\n (select i_item_id item_id\n        ,sum(cs_ext_sales_price) cs_item_rev\n  from catalog_sales\n      ,item\n      ,date_dim\n where cs_item_sk = i_item_sk\n  and  d_date in (select d_date\n                  from date_dim\n                  where d_week_seq = (select d_week_seq \n                                      from date_dim\n                                      where d_date = '1998-02-21'))\n  and  cs_sold_date_sk = d_date_sk\n group by i_item_id),\n ws_items as\n (select i_item_id item_id\n        ,sum(ws_ext_sales_price) ws_item_rev\n  from web_sales\n      ,item\n      ,date_dim\n where ws_item_sk = i_item_sk\n  and  d_date in (select d_date\n                  from date_dim\n                  where d_week_seq =(select d_week_seq \n                                     from date_dim\n                                     where d_date = '1998-02-21'))\n  and ws_sold_date_sk   = d_date_sk\n group by i_item_id)\n  select  ss_items.item_id\n       ,ss_item_rev\n       ,ss_item_rev/((ss_item_rev+cs_item_rev+ws_item_rev)/3) * 100 ss_dev\n       ,cs_item_rev\n       ,cs_item_rev/((ss_item_rev+cs_item_rev+ws_item_rev)/3) * 100 cs_dev\n       ,ws_item_rev\n       ,ws_item_rev/((ss_item_rev+cs_item_rev+ws_item_rev)/3) * 100 ws_dev\n       ,(ss_item_rev+cs_item_rev+ws_item_rev)/3 average\n from ss_items,cs_items,ws_items\n where ss_items.item_id=cs_items.item_id\n   and 
ss_items.item_id=ws_items.item_id \n   and ss_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev\n   and ss_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev\n   and cs_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev\n   and cs_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev\n   and ws_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev\n   and ws_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev\n order by item_id\n         ,ss_item_rev\n limit 100;\n\n-- end query 1 in stream 0 using template query58.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query59.sql",
    "content": "-- start query 1 in stream 0 using template query59.tpl and seed 1819994127\nwith wss as \n (select d_week_seq,\n        ss_store_sk,\n        sum(case when (d_day_name='Sunday') then ss_sales_price else null end) sun_sales,\n        sum(case when (d_day_name='Monday') then ss_sales_price else null end) mon_sales,\n        sum(case when (d_day_name='Tuesday') then ss_sales_price else  null end) tue_sales,\n        sum(case when (d_day_name='Wednesday') then ss_sales_price else null end) wed_sales,\n        sum(case when (d_day_name='Thursday') then ss_sales_price else null end) thu_sales,\n        sum(case when (d_day_name='Friday') then ss_sales_price else null end) fri_sales,\n        sum(case when (d_day_name='Saturday') then ss_sales_price else null end) sat_sales\n from store_sales,date_dim\n where d_date_sk = ss_sold_date_sk\n group by d_week_seq,ss_store_sk\n )\n  select  s_store_name1,s_store_id1,d_week_seq1\n       ,sun_sales1/sun_sales2,mon_sales1/mon_sales2\n       ,tue_sales1/tue_sales2,wed_sales1/wed_sales2,thu_sales1/thu_sales2\n       ,fri_sales1/fri_sales2,sat_sales1/sat_sales2\n from\n (select s_store_name s_store_name1,wss.d_week_seq d_week_seq1\n        ,s_store_id s_store_id1,sun_sales sun_sales1\n        ,mon_sales mon_sales1,tue_sales tue_sales1\n        ,wed_sales wed_sales1,thu_sales thu_sales1\n        ,fri_sales fri_sales1,sat_sales sat_sales1\n  from wss,store,date_dim d\n  where d.d_week_seq = wss.d_week_seq and\n        ss_store_sk = s_store_sk and \n        d_month_seq between 1205 and 1205 + 11) y,\n (select s_store_name s_store_name2,wss.d_week_seq d_week_seq2\n        ,s_store_id s_store_id2,sun_sales sun_sales2\n        ,mon_sales mon_sales2,tue_sales tue_sales2\n        ,wed_sales wed_sales2,thu_sales thu_sales2\n        ,fri_sales fri_sales2,sat_sales sat_sales2\n  from wss,store,date_dim d\n  where d.d_week_seq = wss.d_week_seq and\n        ss_store_sk = s_store_sk and \n        d_month_seq between 1205+ 12 and 
1205 + 23) x\n where s_store_id1=s_store_id2\n   and d_week_seq1=d_week_seq2-52\n order by s_store_name1,s_store_id1,d_week_seq1\nlimit 100;\n\n-- end query 1 in stream 0 using template query59.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query6.sql",
    "content": "-- start query 1 in stream 0 using template query6.tpl and seed 1819994127\nselect  a.ca_state state, count(*) cnt\n from customer_address a\n     ,customer c\n     ,store_sales s\n     ,date_dim d\n     ,item i\n where       a.ca_address_sk = c.c_current_addr_sk\n \tand c.c_customer_sk = s.ss_customer_sk\n \tand s.ss_sold_date_sk = d.d_date_sk\n \tand s.ss_item_sk = i.i_item_sk\n \tand d.d_month_seq = \n \t     (select distinct (d_month_seq)\n \t      from date_dim\n               where d_year = 2002\n \t        and d_moy = 3 )\n \tand i.i_current_price > 1.2 * \n             (select avg(j.i_current_price) \n \t     from item j \n \t     where j.i_category = i.i_category)\n group by a.ca_state\n having count(*) >= 10\n order by cnt, a.ca_state \n limit 100;\n\n-- end query 1 in stream 0 using template query6.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query60.sql",
    "content": "-- start query 1 in stream 0 using template query60.tpl and seed 1930872976\nwith ss as (\n select\n          i_item_id,sum(ss_ext_sales_price) total_sales\n from\n \tstore_sales,\n \tdate_dim,\n         customer_address,\n         item\n where\n         i_item_id in (select\n  i_item_id\nfrom\n item\nwhere i_category in ('Children'))\n and     ss_item_sk              = i_item_sk\n and     ss_sold_date_sk         = d_date_sk\n and     d_year                  = 1998\n and     d_moy                   = 10\n and     ss_addr_sk              = ca_address_sk\n and     ca_gmt_offset           = -5 \n group by i_item_id),\n cs as (\n select\n          i_item_id,sum(cs_ext_sales_price) total_sales\n from\n \tcatalog_sales,\n \tdate_dim,\n         customer_address,\n         item\n where\n         i_item_id               in (select\n  i_item_id\nfrom\n item\nwhere i_category in ('Children'))\n and     cs_item_sk              = i_item_sk\n and     cs_sold_date_sk         = d_date_sk\n and     d_year                  = 1998\n and     d_moy                   = 10\n and     cs_bill_addr_sk         = ca_address_sk\n and     ca_gmt_offset           = -5 \n group by i_item_id),\n ws as (\n select\n          i_item_id,sum(ws_ext_sales_price) total_sales\n from\n \tweb_sales,\n \tdate_dim,\n         customer_address,\n         item\n where\n         i_item_id               in (select\n  i_item_id\nfrom\n item\nwhere i_category in ('Children'))\n and     ws_item_sk              = i_item_sk\n and     ws_sold_date_sk         = d_date_sk\n and     d_year                  = 1998\n and     d_moy                   = 10\n and     ws_bill_addr_sk         = ca_address_sk\n and     ca_gmt_offset           = -5\n group by i_item_id)\n  select   \n  i_item_id\n,sum(total_sales) total_sales\n from  (select * from ss \n        union all\n        select * from cs \n        union all\n        select * from ws) tmp1\n group by i_item_id\n order by i_item_id\n      ,total_sales\n limit 
100;\n\n-- end query 1 in stream 0 using template query60.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query61.sql",
    "content": "-- start query 1 in stream 0 using template query61.tpl and seed 1930872976\nselect  promotions,total,cast(promotions as decimal(15,4))/cast(total as decimal(15,4))*100\nfrom\n  (select sum(ss_ext_sales_price) promotions\n   from  store_sales\n        ,store\n        ,promotion\n        ,date_dim\n        ,customer\n        ,customer_address \n        ,item\n   where ss_sold_date_sk = d_date_sk\n   and   ss_store_sk = s_store_sk\n   and   ss_promo_sk = p_promo_sk\n   and   ss_customer_sk= c_customer_sk\n   and   ca_address_sk = c_current_addr_sk\n   and   ss_item_sk = i_item_sk \n   and   ca_gmt_offset = -6\n   and   i_category = 'Sports'\n   and   (p_channel_dmail = 'Y' or p_channel_email = 'Y' or p_channel_tv = 'Y')\n   and   s_gmt_offset = -6\n   and   d_year = 2001\n   and   d_moy  = 12) promotional_sales,\n  (select sum(ss_ext_sales_price) total\n   from  store_sales\n        ,store\n        ,date_dim\n        ,customer\n        ,customer_address\n        ,item\n   where ss_sold_date_sk = d_date_sk\n   and   ss_store_sk = s_store_sk\n   and   ss_customer_sk= c_customer_sk\n   and   ca_address_sk = c_current_addr_sk\n   and   ss_item_sk = i_item_sk\n   and   ca_gmt_offset = -6\n   and   i_category = 'Sports'\n   and   s_gmt_offset = -6\n   and   d_year = 2001\n   and   d_moy  = 12) all_sales\norder by promotions, total\nlimit 100;\n\n-- end query 1 in stream 0 using template query61.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query62.sql",
    "content": "-- start query 1 in stream 0 using template query62.tpl and seed 1819994127\nselect  \n   substr(w_warehouse_name,1,20)\n  ,sm_type\n  ,web_name\n  ,sum(case when (ws_ship_date_sk - ws_sold_date_sk <= 30 ) then 1 else 0 end)  as `30 days` \n  ,sum(case when (ws_ship_date_sk - ws_sold_date_sk > 30) and \n                 (ws_ship_date_sk - ws_sold_date_sk <= 60) then 1 else 0 end )  as `31-60 days` \n  ,sum(case when (ws_ship_date_sk - ws_sold_date_sk > 60) and \n                 (ws_ship_date_sk - ws_sold_date_sk <= 90) then 1 else 0 end)  as `61-90 days` \n  ,sum(case when (ws_ship_date_sk - ws_sold_date_sk > 90) and\n                 (ws_ship_date_sk - ws_sold_date_sk <= 120) then 1 else 0 end)  as `91-120 days` \n  ,sum(case when (ws_ship_date_sk - ws_sold_date_sk  > 120) then 1 else 0 end)  as `>120 days` \nfrom\n   web_sales\n  ,warehouse\n  ,ship_mode\n  ,web_site\n  ,date_dim\nwhere\n    d_month_seq between 1215 and 1215 + 11\nand ws_ship_date_sk   = d_date_sk\nand ws_warehouse_sk   = w_warehouse_sk\nand ws_ship_mode_sk   = sm_ship_mode_sk\nand ws_web_site_sk    = web_site_sk\ngroup by\n   substr(w_warehouse_name,1,20)\n  ,sm_type\n  ,web_name\norder by substr(w_warehouse_name,1,20)\n        ,sm_type\n       ,web_name\nlimit 100;\n\n-- end query 1 in stream 0 using template query62.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query63.sql",
    "content": "-- start query 1 in stream 0 using template query63.tpl and seed 1819994127\nselect  * \nfrom (select i_manager_id\n             ,sum(ss_sales_price) sum_sales\n             ,avg(sum(ss_sales_price)) over (partition by i_manager_id) avg_monthly_sales\n      from item\n          ,store_sales\n          ,date_dim\n          ,store\n      where ss_item_sk = i_item_sk\n        and ss_sold_date_sk = d_date_sk\n        and ss_store_sk = s_store_sk\n        and d_month_seq in (1211,1211+1,1211+2,1211+3,1211+4,1211+5,1211+6,1211+7,1211+8,1211+9,1211+10,1211+11)\n        and ((    i_category in ('Books','Children','Electronics')\n              and i_class in ('personal','portable','reference','self-help')\n              and i_brand in ('scholaramalgamalg #14','scholaramalgamalg #7',\n\t\t                  'exportiunivamalg #9','scholaramalgamalg #9'))\n           or(    i_category in ('Women','Music','Men')\n              and i_class in ('accessories','classical','fragrances','pants')\n              and i_brand in ('amalgimporto #1','edu packscholar #1','exportiimporto #1',\n\t\t                 'importoamalg #1')))\ngroup by i_manager_id, d_moy) tmp1\nwhere case when avg_monthly_sales > 0 then abs (sum_sales - avg_monthly_sales) / avg_monthly_sales else null end > 0.1\norder by i_manager_id\n        ,avg_monthly_sales\n        ,sum_sales\nlimit 100;\n\n-- end query 1 in stream 0 using template query63.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query64.sql",
    "content": "-- start query 1 in stream 0 using template query64.tpl and seed 1220860970\nwith cs_ui as\n (select cs_item_sk\n        ,sum(cs_ext_list_price) as sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund\n  from catalog_sales\n      ,catalog_returns\n  where cs_item_sk = cr_item_sk\n    and cs_order_number = cr_order_number\n  group by cs_item_sk\n  having sum(cs_ext_list_price)>2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit)),\ncross_sales as\n (select i_product_name product_name\n     ,i_item_sk item_sk\n     ,s_store_name store_name\n     ,s_zip store_zip\n     ,ad1.ca_street_number b_street_number\n     ,ad1.ca_street_name b_street_name\n     ,ad1.ca_city b_city\n     ,ad1.ca_zip b_zip\n     ,ad2.ca_street_number c_street_number\n     ,ad2.ca_street_name c_street_name\n     ,ad2.ca_city c_city\n     ,ad2.ca_zip c_zip\n     ,d1.d_year as syear\n     ,d2.d_year as fsyear\n     ,d3.d_year s2year\n     ,count(*) cnt\n     ,sum(ss_wholesale_cost) s1\n     ,sum(ss_list_price) s2\n     ,sum(ss_coupon_amt) s3\n  FROM   store_sales\n        ,store_returns\n        ,cs_ui\n        ,date_dim d1\n        ,date_dim d2\n        ,date_dim d3\n        ,store\n        ,customer\n        ,customer_demographics cd1\n        ,customer_demographics cd2\n        ,promotion\n        ,household_demographics hd1\n        ,household_demographics hd2\n        ,customer_address ad1\n        ,customer_address ad2\n        ,income_band ib1\n        ,income_band ib2\n        ,item\n  WHERE  ss_store_sk = s_store_sk AND\n         ss_sold_date_sk = d1.d_date_sk AND\n         ss_customer_sk = c_customer_sk AND\n         ss_cdemo_sk= cd1.cd_demo_sk AND\n         ss_hdemo_sk = hd1.hd_demo_sk AND\n         ss_addr_sk = ad1.ca_address_sk and\n         ss_item_sk = i_item_sk and\n         ss_item_sk = sr_item_sk and\n         ss_ticket_number = sr_ticket_number and\n         ss_item_sk = cs_ui.cs_item_sk and\n         c_current_cdemo_sk = cd2.cd_demo_sk 
AND\n         c_current_hdemo_sk = hd2.hd_demo_sk AND\n         c_current_addr_sk = ad2.ca_address_sk and\n         c_first_sales_date_sk = d2.d_date_sk and\n         c_first_shipto_date_sk = d3.d_date_sk and\n         ss_promo_sk = p_promo_sk and\n         hd1.hd_income_band_sk = ib1.ib_income_band_sk and\n         hd2.hd_income_band_sk = ib2.ib_income_band_sk and\n         cd1.cd_marital_status <> cd2.cd_marital_status and\n         i_color in ('azure','gainsboro','misty','blush','hot','lemon') and\n         i_current_price between 80 and 80 + 10 and\n         i_current_price between 80 + 1 and 80 + 15\ngroup by i_product_name\n       ,i_item_sk\n       ,s_store_name\n       ,s_zip\n       ,ad1.ca_street_number\n       ,ad1.ca_street_name\n       ,ad1.ca_city\n       ,ad1.ca_zip\n       ,ad2.ca_street_number\n       ,ad2.ca_street_name\n       ,ad2.ca_city\n       ,ad2.ca_zip\n       ,d1.d_year\n       ,d2.d_year\n       ,d3.d_year\n)\nselect cs1.product_name\n     ,cs1.store_name\n     ,cs1.store_zip\n     ,cs1.b_street_number\n     ,cs1.b_street_name\n     ,cs1.b_city\n     ,cs1.b_zip\n     ,cs1.c_street_number\n     ,cs1.c_street_name\n     ,cs1.c_city\n     ,cs1.c_zip\n     ,cs1.syear\n     ,cs1.cnt\n     ,cs1.s1 as s11\n     ,cs1.s2 as s21\n     ,cs1.s3 as s31\n     ,cs2.s1 as s12\n     ,cs2.s2 as s22\n     ,cs2.s3 as s32\n     ,cs2.syear\n     ,cs2.cnt\nfrom cross_sales cs1,cross_sales cs2\nwhere cs1.item_sk=cs2.item_sk and\n     cs1.syear = 1999 and\n     cs2.syear = 1999 + 1 and\n     cs2.cnt <= cs1.cnt and\n     cs1.store_name = cs2.store_name and\n     cs1.store_zip = cs2.store_zip\norder by cs1.product_name\n       ,cs1.store_name\n       ,cs2.cnt\n       ,cs1.s1\n       ,cs2.s1;\n\n-- end query 1 in stream 0 using template query64.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query65.sql",
    "content": "-- start query 1 in stream 0 using template query65.tpl and seed 1819994127\nselect \n\ts_store_name,\n\ti_item_desc,\n\tsc.revenue,\n\ti_current_price,\n\ti_wholesale_cost,\n\ti_brand\n from store, item,\n     (select ss_store_sk, avg(revenue) as ave\n \tfrom\n \t    (select  ss_store_sk, ss_item_sk, \n \t\t     sum(ss_sales_price) as revenue\n \t\tfrom store_sales, date_dim\n \t\twhere ss_sold_date_sk = d_date_sk and d_month_seq between 1186 and 1186+11\n \t\tgroup by ss_store_sk, ss_item_sk) sa\n \tgroup by ss_store_sk) sb,\n     (select  ss_store_sk, ss_item_sk, sum(ss_sales_price) as revenue\n \tfrom store_sales, date_dim\n \twhere ss_sold_date_sk = d_date_sk and d_month_seq between 1186 and 1186+11\n \tgroup by ss_store_sk, ss_item_sk) sc\n where sb.ss_store_sk = sc.ss_store_sk and \n       sc.revenue <= 0.1 * sb.ave and\n       s_store_sk = sc.ss_store_sk and\n       i_item_sk = sc.ss_item_sk\n order by s_store_name, i_item_desc\nlimit 100;\n\n-- end query 1 in stream 0 using template query65.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query66.sql",
    "content": "-- start query 1 in stream 0 using template query66.tpl and seed 2042478054\nselect   \n         w_warehouse_name\n \t,w_warehouse_sq_ft\n \t,w_city\n \t,w_county\n \t,w_state\n \t,w_country\n        ,ship_carriers\n        ,year\n \t,sum(jan_sales) as jan_sales\n \t,sum(feb_sales) as feb_sales\n \t,sum(mar_sales) as mar_sales\n \t,sum(apr_sales) as apr_sales\n \t,sum(may_sales) as may_sales\n \t,sum(jun_sales) as jun_sales\n \t,sum(jul_sales) as jul_sales\n \t,sum(aug_sales) as aug_sales\n \t,sum(sep_sales) as sep_sales\n \t,sum(oct_sales) as oct_sales\n \t,sum(nov_sales) as nov_sales\n \t,sum(dec_sales) as dec_sales\n \t,sum(jan_sales/w_warehouse_sq_ft) as jan_sales_per_sq_foot\n \t,sum(feb_sales/w_warehouse_sq_ft) as feb_sales_per_sq_foot\n \t,sum(mar_sales/w_warehouse_sq_ft) as mar_sales_per_sq_foot\n \t,sum(apr_sales/w_warehouse_sq_ft) as apr_sales_per_sq_foot\n \t,sum(may_sales/w_warehouse_sq_ft) as may_sales_per_sq_foot\n \t,sum(jun_sales/w_warehouse_sq_ft) as jun_sales_per_sq_foot\n \t,sum(jul_sales/w_warehouse_sq_ft) as jul_sales_per_sq_foot\n \t,sum(aug_sales/w_warehouse_sq_ft) as aug_sales_per_sq_foot\n \t,sum(sep_sales/w_warehouse_sq_ft) as sep_sales_per_sq_foot\n \t,sum(oct_sales/w_warehouse_sq_ft) as oct_sales_per_sq_foot\n \t,sum(nov_sales/w_warehouse_sq_ft) as nov_sales_per_sq_foot\n \t,sum(dec_sales/w_warehouse_sq_ft) as dec_sales_per_sq_foot\n \t,sum(jan_net) as jan_net\n \t,sum(feb_net) as feb_net\n \t,sum(mar_net) as mar_net\n \t,sum(apr_net) as apr_net\n \t,sum(may_net) as may_net\n \t,sum(jun_net) as jun_net\n \t,sum(jul_net) as jul_net\n \t,sum(aug_net) as aug_net\n \t,sum(sep_net) as sep_net\n \t,sum(oct_net) as oct_net\n \t,sum(nov_net) as nov_net\n \t,sum(dec_net) as dec_net\n from (\n     select \n \tw_warehouse_name\n \t,w_warehouse_sq_ft\n \t,w_city\n \t,w_county\n \t,w_state\n \t,w_country\n \t,'MSC' || ',' || 'GERMA' as ship_carriers\n       ,d_year as year\n \t,sum(case when d_moy = 1 \n \t\tthen ws_sales_price* 
ws_quantity else 0 end) as jan_sales\n \t,sum(case when d_moy = 2 \n \t\tthen ws_sales_price* ws_quantity else 0 end) as feb_sales\n \t,sum(case when d_moy = 3 \n \t\tthen ws_sales_price* ws_quantity else 0 end) as mar_sales\n \t,sum(case when d_moy = 4 \n \t\tthen ws_sales_price* ws_quantity else 0 end) as apr_sales\n \t,sum(case when d_moy = 5 \n \t\tthen ws_sales_price* ws_quantity else 0 end) as may_sales\n \t,sum(case when d_moy = 6 \n \t\tthen ws_sales_price* ws_quantity else 0 end) as jun_sales\n \t,sum(case when d_moy = 7 \n \t\tthen ws_sales_price* ws_quantity else 0 end) as jul_sales\n \t,sum(case when d_moy = 8 \n \t\tthen ws_sales_price* ws_quantity else 0 end) as aug_sales\n \t,sum(case when d_moy = 9 \n \t\tthen ws_sales_price* ws_quantity else 0 end) as sep_sales\n \t,sum(case when d_moy = 10 \n \t\tthen ws_sales_price* ws_quantity else 0 end) as oct_sales\n \t,sum(case when d_moy = 11\n \t\tthen ws_sales_price* ws_quantity else 0 end) as nov_sales\n \t,sum(case when d_moy = 12\n \t\tthen ws_sales_price* ws_quantity else 0 end) as dec_sales\n \t,sum(case when d_moy = 1 \n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as jan_net\n \t,sum(case when d_moy = 2\n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as feb_net\n \t,sum(case when d_moy = 3 \n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as mar_net\n \t,sum(case when d_moy = 4 \n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as apr_net\n \t,sum(case when d_moy = 5 \n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as may_net\n \t,sum(case when d_moy = 6 \n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as jun_net\n \t,sum(case when d_moy = 7 \n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as jul_net\n \t,sum(case when d_moy = 8 \n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as aug_net\n \t,sum(case when d_moy = 9 \n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as sep_net\n 
\t,sum(case when d_moy = 10 \n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as oct_net\n \t,sum(case when d_moy = 11\n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as nov_net\n \t,sum(case when d_moy = 12\n \t\tthen ws_net_paid_inc_ship_tax * ws_quantity else 0 end) as dec_net\n     from\n          web_sales\n         ,warehouse\n         ,date_dim\n         ,time_dim\n \t  ,ship_mode\n     where\n            ws_warehouse_sk =  w_warehouse_sk\n        and ws_sold_date_sk = d_date_sk\n        and ws_sold_time_sk = t_time_sk\n \tand ws_ship_mode_sk = sm_ship_mode_sk\n        and d_year = 2001\n \tand t_time between 9453 and 9453+28800 \n \tand sm_carrier in ('MSC','GERMA')\n     group by \n        w_warehouse_name\n \t,w_warehouse_sq_ft\n \t,w_city\n \t,w_county\n \t,w_state\n \t,w_country\n       ,d_year\n union all\n     select \n \tw_warehouse_name\n \t,w_warehouse_sq_ft\n \t,w_city\n \t,w_county\n \t,w_state\n \t,w_country\n \t,'MSC' || ',' || 'GERMA' as ship_carriers\n       ,d_year as year\n \t,sum(case when d_moy = 1 \n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as jan_sales\n \t,sum(case when d_moy = 2 \n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as feb_sales\n \t,sum(case when d_moy = 3 \n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as mar_sales\n \t,sum(case when d_moy = 4 \n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as apr_sales\n \t,sum(case when d_moy = 5 \n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as may_sales\n \t,sum(case when d_moy = 6 \n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as jun_sales\n \t,sum(case when d_moy = 7 \n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as jul_sales\n \t,sum(case when d_moy = 8 \n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as aug_sales\n \t,sum(case when d_moy = 9 \n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as sep_sales\n \t,sum(case when d_moy = 10 \n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as 
oct_sales\n \t,sum(case when d_moy = 11\n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as nov_sales\n \t,sum(case when d_moy = 12\n \t\tthen cs_ext_list_price* cs_quantity else 0 end) as dec_sales\n \t,sum(case when d_moy = 1 \n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as jan_net\n \t,sum(case when d_moy = 2 \n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as feb_net\n \t,sum(case when d_moy = 3 \n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as mar_net\n \t,sum(case when d_moy = 4 \n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as apr_net\n \t,sum(case when d_moy = 5 \n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as may_net\n \t,sum(case when d_moy = 6 \n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as jun_net\n \t,sum(case when d_moy = 7 \n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as jul_net\n \t,sum(case when d_moy = 8 \n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as aug_net\n \t,sum(case when d_moy = 9 \n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as sep_net\n \t,sum(case when d_moy = 10 \n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as oct_net\n \t,sum(case when d_moy = 11\n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as nov_net\n \t,sum(case when d_moy = 12\n \t\tthen cs_net_paid_inc_ship * cs_quantity else 0 end) as dec_net\n     from\n          catalog_sales\n         ,warehouse\n         ,date_dim\n         ,time_dim\n \t ,ship_mode\n     where\n            cs_warehouse_sk =  w_warehouse_sk\n        and cs_sold_date_sk = d_date_sk\n        and cs_sold_time_sk = t_time_sk\n \tand cs_ship_mode_sk = sm_ship_mode_sk\n        and d_year = 2001\n \tand t_time between 9453 AND 9453+28800 \n \tand sm_carrier in ('MSC','GERMA')\n     group by \n        w_warehouse_name\n \t,w_warehouse_sq_ft\n \t,w_city\n \t,w_county\n \t,w_state\n \t,w_country\n       ,d_year\n ) x\n group by \n        w_warehouse_name\n 
\t,w_warehouse_sq_ft\n \t,w_city\n \t,w_county\n \t,w_state\n \t,w_country\n \t,ship_carriers\n       ,year\n order by w_warehouse_name\n limit 100;\n\n-- end query 1 in stream 0 using template query66.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query67.sql",
    "content": "-- start query 1 in stream 0 using template query67.tpl and seed 1819994127\nselect  *\nfrom (select i_category\n            ,i_class\n            ,i_brand\n            ,i_product_name\n            ,d_year\n            ,d_qoy\n            ,d_moy\n            ,s_store_id\n            ,sumsales\n            ,rank() over (partition by i_category order by sumsales desc) rk\n      from (select i_category\n                  ,i_class\n                  ,i_brand\n                  ,i_product_name\n                  ,d_year\n                  ,d_qoy\n                  ,d_moy\n                  ,s_store_id\n                  ,sum(coalesce(ss_sales_price*ss_quantity,0)) sumsales\n            from store_sales\n                ,date_dim\n                ,store\n                ,item\n       where  ss_sold_date_sk=d_date_sk\n          and ss_item_sk=i_item_sk\n          and ss_store_sk = s_store_sk\n          and d_month_seq between 1185 and 1185+11\n       group by  rollup(i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy,s_store_id))dw1) dw2\nwhere rk <= 100\norder by i_category\n        ,i_class\n        ,i_brand\n        ,i_product_name\n        ,d_year\n        ,d_qoy\n        ,d_moy\n        ,s_store_id\n        ,sumsales\n        ,rk\nlimit 100;\n\n-- end query 1 in stream 0 using template query67.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query68.sql",
    "content": "-- start query 1 in stream 0 using template query68.tpl and seed 803547492\nselect  c_last_name\n       ,c_first_name\n       ,ca_city\n       ,bought_city\n       ,ss_ticket_number\n       ,extended_price\n       ,extended_tax\n       ,list_price\n from (select ss_ticket_number\n             ,ss_customer_sk\n             ,ca_city bought_city\n             ,sum(ss_ext_sales_price) extended_price \n             ,sum(ss_ext_list_price) list_price\n             ,sum(ss_ext_tax) extended_tax \n       from store_sales\n           ,date_dim\n           ,store\n           ,household_demographics\n           ,customer_address \n       where store_sales.ss_sold_date_sk = date_dim.d_date_sk\n         and store_sales.ss_store_sk = store.s_store_sk  \n        and store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk\n        and store_sales.ss_addr_sk = customer_address.ca_address_sk\n        and date_dim.d_dom between 1 and 2 \n        and (household_demographics.hd_dep_count = 4 or\n             household_demographics.hd_vehicle_count= 0)\n        and date_dim.d_year in (1999,1999+1,1999+2)\n        and store.s_city in ('Pleasant Hill','Bethel')\n       group by ss_ticket_number\n               ,ss_customer_sk\n               ,ss_addr_sk,ca_city) dn\n      ,customer\n      ,customer_address current_addr\n where ss_customer_sk = c_customer_sk\n   and customer.c_current_addr_sk = current_addr.ca_address_sk\n   and current_addr.ca_city <> bought_city\n order by c_last_name\n         ,ss_ticket_number\n limit 100;\n\n-- end query 1 in stream 0 using template query68.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query69.sql",
    "content": "-- start query 1 in stream 0 using template query69.tpl and seed 797269820\nselect  \n  cd_gender,\n  cd_marital_status,\n  cd_education_status,\n  count(*) cnt1,\n  cd_purchase_estimate,\n  count(*) cnt2,\n  cd_credit_rating,\n  count(*) cnt3\n from\n  customer c,customer_address ca,customer_demographics\n where\n  c.c_current_addr_sk = ca.ca_address_sk and\n  ca_state in ('MO','MN','AZ') and\n  cd_demo_sk = c.c_current_cdemo_sk and \n  exists (select *\n          from store_sales,date_dim\n          where c.c_customer_sk = ss_customer_sk and\n                ss_sold_date_sk = d_date_sk and\n                d_year = 2003 and\n                d_moy between 2 and 2+2) and\n   (not exists (select *\n            from web_sales,date_dim\n            where c.c_customer_sk = ws_bill_customer_sk and\n                  ws_sold_date_sk = d_date_sk and\n                  d_year = 2003 and\n                  d_moy between 2 and 2+2) and\n    not exists (select * \n            from catalog_sales,date_dim\n            where c.c_customer_sk = cs_ship_customer_sk and\n                  cs_sold_date_sk = d_date_sk and\n                  d_year = 2003 and\n                  d_moy between 2 and 2+2))\n group by cd_gender,\n          cd_marital_status,\n          cd_education_status,\n          cd_purchase_estimate,\n          cd_credit_rating\n order by cd_gender,\n          cd_marital_status,\n          cd_education_status,\n          cd_purchase_estimate,\n          cd_credit_rating\n limit 100;\n\n-- end query 1 in stream 0 using template query69.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query7.sql",
    "content": "-- start query 1 in stream 0 using template query7.tpl and seed 1930872976\nselect  i_item_id, \n        avg(ss_quantity) agg1,\n        avg(ss_list_price) agg2,\n        avg(ss_coupon_amt) agg3,\n        avg(ss_sales_price) agg4 \n from store_sales, customer_demographics, date_dim, item, promotion\n where ss_sold_date_sk = d_date_sk and\n       ss_item_sk = i_item_sk and\n       ss_cdemo_sk = cd_demo_sk and\n       ss_promo_sk = p_promo_sk and\n       cd_gender = 'F' and \n       cd_marital_status = 'W' and\n       cd_education_status = 'College' and\n       (p_channel_email = 'N' or p_channel_event = 'N') and\n       d_year = 2001 \n group by i_item_id\n order by i_item_id\n limit 100;\n\n-- end query 1 in stream 0 using template query7.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query70.sql",
    "content": "-- start query 1 in stream 0 using template query70.tpl and seed 1819994127\nselect  \n    sum(ss_net_profit) as total_sum\n   ,s_state\n   ,s_county\n   ,grouping(s_state)+grouping(s_county) as lochierarchy\n   ,rank() over (\n \tpartition by grouping(s_state)+grouping(s_county),\n \tcase when grouping(s_county) = 0 then s_state end \n \torder by sum(ss_net_profit) desc) as rank_within_parent\n from\n    store_sales\n   ,date_dim       d1\n   ,store\n where\n    d1.d_month_seq between 1218 and 1218+11\n and d1.d_date_sk = ss_sold_date_sk\n and s_store_sk  = ss_store_sk\n and s_state in\n             ( select s_state\n               from  (select s_state as s_state,\n \t\t\t    rank() over ( partition by s_state order by sum(ss_net_profit) desc) as ranking\n                      from   store_sales, store, date_dim\n                      where  d_month_seq between 1218 and 1218+11\n \t\t\t    and d_date_sk = ss_sold_date_sk\n \t\t\t    and s_store_sk  = ss_store_sk\n                      group by s_state\n                     ) tmp1 \n               where ranking <= 5\n             )\n group by rollup(s_state,s_county)\n order by\n   lochierarchy desc\n  ,case when lochierarchy = 0 then s_state end\n  ,rank_within_parent\n limit 100;\n\n-- end query 1 in stream 0 using template query70.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query71.sql",
    "content": "-- start query 1 in stream 0 using template query71.tpl and seed 2031708268\nselect i_brand_id brand_id, i_brand brand,t_hour,t_minute,\n \tsum(ext_price) ext_price\n from item, (select ws_ext_sales_price as ext_price, \n                        ws_sold_date_sk as sold_date_sk,\n                        ws_item_sk as sold_item_sk,\n                        ws_sold_time_sk as time_sk  \n                 from web_sales,date_dim\n                 where d_date_sk = ws_sold_date_sk\n                   and d_moy=12\n                   and d_year=2000\n                 union all\n                 select cs_ext_sales_price as ext_price,\n                        cs_sold_date_sk as sold_date_sk,\n                        cs_item_sk as sold_item_sk,\n                        cs_sold_time_sk as time_sk\n                 from catalog_sales,date_dim\n                 where d_date_sk = cs_sold_date_sk\n                   and d_moy=12\n                   and d_year=2000\n                 union all\n                 select ss_ext_sales_price as ext_price,\n                        ss_sold_date_sk as sold_date_sk,\n                        ss_item_sk as sold_item_sk,\n                        ss_sold_time_sk as time_sk\n                 from store_sales,date_dim\n                 where d_date_sk = ss_sold_date_sk\n                   and d_moy=12\n                   and d_year=2000\n                 ) tmp,time_dim\n where\n   sold_item_sk = i_item_sk\n   and i_manager_id=1\n   and time_sk = t_time_sk\n   and (t_meal_time = 'breakfast' or t_meal_time = 'dinner')\n group by i_brand, i_brand_id,t_hour,t_minute\n order by ext_price desc, i_brand_id\n ;\n\n-- end query 1 in stream 0 using template query71.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query72.sql",
    "content": "-- start query 1 in stream 0 using template query72.tpl and seed 2031708268\nselect  i_item_desc\n      ,w_warehouse_name\n      ,d1.d_week_seq\n      ,sum(case when p_promo_sk is null then 1 else 0 end) no_promo\n      ,sum(case when p_promo_sk is not null then 1 else 0 end) promo\n      ,count(*) total_cnt\nfrom catalog_sales\njoin inventory on (cs_item_sk = inv_item_sk)\njoin warehouse on (w_warehouse_sk=inv_warehouse_sk)\njoin item on (i_item_sk = cs_item_sk)\njoin customer_demographics on (cs_bill_cdemo_sk = cd_demo_sk)\njoin household_demographics on (cs_bill_hdemo_sk = hd_demo_sk)\njoin date_dim d1 on (cs_sold_date_sk = d1.d_date_sk)\njoin date_dim d2 on (inv_date_sk = d2.d_date_sk)\njoin date_dim d3 on (cs_ship_date_sk = d3.d_date_sk)\nleft outer join promotion on (cs_promo_sk=p_promo_sk)\nleft outer join catalog_returns on (cr_item_sk = cs_item_sk and cr_order_number = cs_order_number)\nwhere d1.d_week_seq = d2.d_week_seq\n  and inv_quantity_on_hand < cs_quantity \n  and d3.d_date > d1.d_date + INTERVAL(5) DAY \n  and hd_buy_potential = '1001-5000'\n  and d1.d_year = 2000\n  and cd_marital_status = 'D'\ngroup by i_item_desc,w_warehouse_name,d1.d_week_seq\norder by total_cnt desc, i_item_desc, w_warehouse_name, d_week_seq\nlimit 100;\n\n-- end query 1 in stream 0 using template query72.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query73.sql",
    "content": "-- start query 1 in stream 0 using template query73.tpl and seed 1971067816\nselect c_last_name\n       ,c_first_name\n       ,c_salutation\n       ,c_preferred_cust_flag \n       ,ss_ticket_number\n       ,cnt from\n   (select ss_ticket_number\n          ,ss_customer_sk\n          ,count(*) cnt\n    from store_sales,date_dim,store,household_demographics\n    where store_sales.ss_sold_date_sk = date_dim.d_date_sk\n    and store_sales.ss_store_sk = store.s_store_sk  \n    and store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk\n    and date_dim.d_dom between 1 and 2 \n    and (household_demographics.hd_buy_potential = '>10000' or\n         household_demographics.hd_buy_potential = '5001-10000')\n    and household_demographics.hd_vehicle_count > 0\n    and case when household_demographics.hd_vehicle_count > 0 then \n             household_demographics.hd_dep_count/ household_demographics.hd_vehicle_count else null end > 1\n    and date_dim.d_year in (2000,2000+1,2000+2)\n    and store.s_county in ('Lea County','Furnas County','Pennington County','Bronx County')\n    group by ss_ticket_number,ss_customer_sk) dj,customer\n    where ss_customer_sk = c_customer_sk\n      and cnt between 1 and 5\n    order by cnt desc, c_last_name asc;\n\n-- end query 1 in stream 0 using template query73.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query74.sql",
    "content": "-- start query 1 in stream 0 using template query74.tpl and seed 1556717815\nwith year_total as (\n select c_customer_id customer_id\n       ,c_first_name customer_first_name\n       ,c_last_name customer_last_name\n       ,d_year as year\n       ,sum(ss_net_paid) year_total\n       ,'s' sale_type\n from customer\n     ,store_sales\n     ,date_dim\n where c_customer_sk = ss_customer_sk\n   and ss_sold_date_sk = d_date_sk\n   and d_year in (1998,1998+1)\n group by c_customer_id\n         ,c_first_name\n         ,c_last_name\n         ,d_year\n union all\n select c_customer_id customer_id\n       ,c_first_name customer_first_name\n       ,c_last_name customer_last_name\n       ,d_year as year\n       ,sum(ws_net_paid) year_total\n       ,'w' sale_type\n from customer\n     ,web_sales\n     ,date_dim\n where c_customer_sk = ws_bill_customer_sk\n   and ws_sold_date_sk = d_date_sk\n   and d_year in (1998,1998+1)\n group by c_customer_id\n         ,c_first_name\n         ,c_last_name\n         ,d_year\n         )\n  select \n        t_s_secyear.customer_id, t_s_secyear.customer_first_name, t_s_secyear.customer_last_name\n from year_total t_s_firstyear\n     ,year_total t_s_secyear\n     ,year_total t_w_firstyear\n     ,year_total t_w_secyear\n where t_s_secyear.customer_id = t_s_firstyear.customer_id\n         and t_s_firstyear.customer_id = t_w_secyear.customer_id\n         and t_s_firstyear.customer_id = t_w_firstyear.customer_id\n         and t_s_firstyear.sale_type = 's'\n         and t_w_firstyear.sale_type = 'w'\n         and t_s_secyear.sale_type = 's'\n         and t_w_secyear.sale_type = 'w'\n         and t_s_firstyear.year = 1998\n         and t_s_secyear.year = 1998+1\n         and t_w_firstyear.year = 1998\n         and t_w_secyear.year = 1998+1\n         and t_s_firstyear.year_total > 0\n         and t_w_firstyear.year_total > 0\n         and case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else 
null end\n           > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else null end\n order by 3,1,2\nlimit 100;\n\n-- end query 1 in stream 0 using template query74.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query75.sql",
    "content": "-- start query 1 in stream 0 using template query75.tpl and seed 1819994127\nWITH all_sales AS (\n SELECT d_year\n       ,i_brand_id\n       ,i_class_id\n       ,i_category_id\n       ,i_manufact_id\n       ,SUM(sales_cnt) AS sales_cnt\n       ,SUM(sales_amt) AS sales_amt\n FROM (SELECT d_year\n             ,i_brand_id\n             ,i_class_id\n             ,i_category_id\n             ,i_manufact_id\n             ,cs_quantity - COALESCE(cr_return_quantity,0) AS sales_cnt\n             ,cs_ext_sales_price - COALESCE(cr_return_amount,0.0) AS sales_amt\n       FROM catalog_sales JOIN item ON i_item_sk=cs_item_sk\n                          JOIN date_dim ON d_date_sk=cs_sold_date_sk\n                          LEFT JOIN catalog_returns ON (cs_order_number=cr_order_number \n                                                    AND cs_item_sk=cr_item_sk)\n       WHERE i_category='Sports'\n       UNION\n       SELECT d_year\n             ,i_brand_id\n             ,i_class_id\n             ,i_category_id\n             ,i_manufact_id\n             ,ss_quantity - COALESCE(sr_return_quantity,0) AS sales_cnt\n             ,ss_ext_sales_price - COALESCE(sr_return_amt,0.0) AS sales_amt\n       FROM store_sales JOIN item ON i_item_sk=ss_item_sk\n                        JOIN date_dim ON d_date_sk=ss_sold_date_sk\n                        LEFT JOIN store_returns ON (ss_ticket_number=sr_ticket_number \n                                                AND ss_item_sk=sr_item_sk)\n       WHERE i_category='Sports'\n       UNION\n       SELECT d_year\n             ,i_brand_id\n             ,i_class_id\n             ,i_category_id\n             ,i_manufact_id\n             ,ws_quantity - COALESCE(wr_return_quantity,0) AS sales_cnt\n             ,ws_ext_sales_price - COALESCE(wr_return_amt,0.0) AS sales_amt\n       FROM web_sales JOIN item ON i_item_sk=ws_item_sk\n                      JOIN date_dim ON d_date_sk=ws_sold_date_sk\n                      LEFT JOIN web_returns ON 
(ws_order_number=wr_order_number \n                                            AND ws_item_sk=wr_item_sk)\n       WHERE i_category='Sports') sales_detail\n GROUP BY d_year, i_brand_id, i_class_id, i_category_id, i_manufact_id)\n SELECT  prev_yr.d_year AS prev_year\n                          ,curr_yr.d_year AS year\n                          ,curr_yr.i_brand_id\n                          ,curr_yr.i_class_id\n                          ,curr_yr.i_category_id\n                          ,curr_yr.i_manufact_id\n                          ,prev_yr.sales_cnt AS prev_yr_cnt\n                          ,curr_yr.sales_cnt AS curr_yr_cnt\n                          ,curr_yr.sales_cnt-prev_yr.sales_cnt AS sales_cnt_diff\n                          ,curr_yr.sales_amt-prev_yr.sales_amt AS sales_amt_diff\n FROM all_sales curr_yr, all_sales prev_yr\n WHERE curr_yr.i_brand_id=prev_yr.i_brand_id\n   AND curr_yr.i_class_id=prev_yr.i_class_id\n   AND curr_yr.i_category_id=prev_yr.i_category_id\n   AND curr_yr.i_manufact_id=prev_yr.i_manufact_id\n   AND curr_yr.d_year=2001\n   AND prev_yr.d_year=2001-1\n   AND CAST(curr_yr.sales_cnt AS DECIMAL(17,2))/CAST(prev_yr.sales_cnt AS DECIMAL(17,2))<0.9\n ORDER BY sales_cnt_diff,sales_amt_diff\n limit 100;\n\n-- end query 1 in stream 0 using template query75.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query76.sql",
    "content": "-- start query 1 in stream 0 using template query76.tpl and seed 2031708268\nselect  channel, col_name, d_year, d_qoy, i_category, COUNT(*) sales_cnt, SUM(ext_sales_price) sales_amt FROM (\n        SELECT 'store' as channel, 'ss_customer_sk' col_name, d_year, d_qoy, i_category, ss_ext_sales_price ext_sales_price\n         FROM store_sales, item, date_dim\n         WHERE ss_customer_sk IS NULL\n           AND ss_sold_date_sk=d_date_sk\n           AND ss_item_sk=i_item_sk\n        UNION ALL\n        SELECT 'web' as channel, 'ws_ship_addr_sk' col_name, d_year, d_qoy, i_category, ws_ext_sales_price ext_sales_price\n         FROM web_sales, item, date_dim\n         WHERE ws_ship_addr_sk IS NULL\n           AND ws_sold_date_sk=d_date_sk\n           AND ws_item_sk=i_item_sk\n        UNION ALL\n        SELECT 'catalog' as channel, 'cs_ship_mode_sk' col_name, d_year, d_qoy, i_category, cs_ext_sales_price ext_sales_price\n         FROM catalog_sales, item, date_dim\n         WHERE cs_ship_mode_sk IS NULL\n           AND cs_sold_date_sk=d_date_sk\n           AND cs_item_sk=i_item_sk) foo\nGROUP BY channel, col_name, d_year, d_qoy, i_category\nORDER BY channel, col_name, d_year, d_qoy, i_category\nlimit 100;\n\n-- end query 1 in stream 0 using template query76.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query77.sql",
    "content": "-- start query 1 in stream 0 using template query77.tpl and seed 1819994127\nwith ss as\n (select s_store_sk,\n         sum(ss_ext_sales_price) as sales,\n         sum(ss_net_profit) as profit\n from store_sales,\n      date_dim,\n      store\n where ss_sold_date_sk = d_date_sk\n       and d_date between cast('2000-08-16' as date) \n                  and (cast('2000-08-16' as date) +  30 days) \n       and ss_store_sk = s_store_sk\n group by s_store_sk)\n ,\n sr as\n (select s_store_sk,\n         sum(sr_return_amt) as returns,\n         sum(sr_net_loss) as profit_loss\n from store_returns,\n      date_dim,\n      store\n where sr_returned_date_sk = d_date_sk\n       and d_date between cast('2000-08-16' as date)\n                  and (cast('2000-08-16' as date) +  30 days)\n       and sr_store_sk = s_store_sk\n group by s_store_sk), \n cs as\n (select cs_call_center_sk,\n        sum(cs_ext_sales_price) as sales,\n        sum(cs_net_profit) as profit\n from catalog_sales,\n      date_dim\n where cs_sold_date_sk = d_date_sk\n       and d_date between cast('2000-08-16' as date)\n                  and (cast('2000-08-16' as date) +  30 days)\n group by cs_call_center_sk \n ), \n cr as\n (select cr_call_center_sk,\n         sum(cr_return_amount) as returns,\n         sum(cr_net_loss) as profit_loss\n from catalog_returns,\n      date_dim\n where cr_returned_date_sk = d_date_sk\n       and d_date between cast('2000-08-16' as date)\n                  and (cast('2000-08-16' as date) +  30 days)\n group by cr_call_center_sk\n ), \n ws as\n ( select wp_web_page_sk,\n        sum(ws_ext_sales_price) as sales,\n        sum(ws_net_profit) as profit\n from web_sales,\n      date_dim,\n      web_page\n where ws_sold_date_sk = d_date_sk\n       and d_date between cast('2000-08-16' as date)\n                  and (cast('2000-08-16' as date) +  30 days)\n       and ws_web_page_sk = wp_web_page_sk\n group by wp_web_page_sk), \n wr as\n (select wp_web_page_sk,\n        
sum(wr_return_amt) as returns,\n        sum(wr_net_loss) as profit_loss\n from web_returns,\n      date_dim,\n      web_page\n where wr_returned_date_sk = d_date_sk\n       and d_date between cast('2000-08-16' as date)\n                  and (cast('2000-08-16' as date) +  30 days)\n       and wr_web_page_sk = wp_web_page_sk\n group by wp_web_page_sk)\n  select  channel\n        , id\n        , sum(sales) as sales\n        , sum(returns) as returns\n        , sum(profit) as profit\n from \n (select 'store channel' as channel\n        , ss.s_store_sk as id\n        , sales\n        , coalesce(returns, 0) as returns\n        , (profit - coalesce(profit_loss,0)) as profit\n from   ss left join sr\n        on  ss.s_store_sk = sr.s_store_sk\n union all\n select 'catalog channel' as channel\n        , cs_call_center_sk as id\n        , sales\n        , returns\n        , (profit - profit_loss) as profit\n from  cs\n       , cr\n union all\n select 'web channel' as channel\n        , ws.wp_web_page_sk as id\n        , sales\n        , coalesce(returns, 0) returns\n        , (profit - coalesce(profit_loss,0)) as profit\n from   ws left join wr\n        on  ws.wp_web_page_sk = wr.wp_web_page_sk\n ) x\n group by rollup (channel, id)\n order by channel\n         ,id\n limit 100;\n\n-- end query 1 in stream 0 using template query77.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query78.sql",
    "content": "-- start query 1 in stream 0 using template query78.tpl and seed 1819994127\nwith ws as\n  (select d_year AS ws_sold_year, ws_item_sk,\n    ws_bill_customer_sk ws_customer_sk,\n    sum(ws_quantity) ws_qty,\n    sum(ws_wholesale_cost) ws_wc,\n    sum(ws_sales_price) ws_sp\n   from web_sales\n   left join web_returns on wr_order_number=ws_order_number and ws_item_sk=wr_item_sk\n   join date_dim on ws_sold_date_sk = d_date_sk\n   where wr_order_number is null\n   group by d_year, ws_item_sk, ws_bill_customer_sk\n   ),\ncs as\n  (select d_year AS cs_sold_year, cs_item_sk,\n    cs_bill_customer_sk cs_customer_sk,\n    sum(cs_quantity) cs_qty,\n    sum(cs_wholesale_cost) cs_wc,\n    sum(cs_sales_price) cs_sp\n   from catalog_sales\n   left join catalog_returns on cr_order_number=cs_order_number and cs_item_sk=cr_item_sk\n   join date_dim on cs_sold_date_sk = d_date_sk\n   where cr_order_number is null\n   group by d_year, cs_item_sk, cs_bill_customer_sk\n   ),\nss as\n  (select d_year AS ss_sold_year, ss_item_sk,\n    ss_customer_sk,\n    sum(ss_quantity) ss_qty,\n    sum(ss_wholesale_cost) ss_wc,\n    sum(ss_sales_price) ss_sp\n   from store_sales\n   left join store_returns on sr_ticket_number=ss_ticket_number and ss_item_sk=sr_item_sk\n   join date_dim on ss_sold_date_sk = d_date_sk\n   where sr_ticket_number is null\n   group by d_year, ss_item_sk, ss_customer_sk\n   )\n select \nss_customer_sk,\nround(ss_qty/(coalesce(ws_qty,0)+coalesce(cs_qty,0)),2) ratio,\nss_qty store_qty, ss_wc store_wholesale_cost, ss_sp store_sales_price,\ncoalesce(ws_qty,0)+coalesce(cs_qty,0) other_chan_qty,\ncoalesce(ws_wc,0)+coalesce(cs_wc,0) other_chan_wholesale_cost,\ncoalesce(ws_sp,0)+coalesce(cs_sp,0) other_chan_sales_price\nfrom ss\nleft join ws on (ws_sold_year=ss_sold_year and ws_item_sk=ss_item_sk and ws_customer_sk=ss_customer_sk)\nleft join cs on (cs_sold_year=ss_sold_year and cs_item_sk=ss_item_sk and cs_customer_sk=ss_customer_sk)\nwhere (coalesce(ws_qty,0)>0 or 
coalesce(cs_qty, 0)>0) and ss_sold_year=2001\norder by \n  ss_customer_sk,\n  ss_qty desc, ss_wc desc, ss_sp desc,\n  other_chan_qty,\n  other_chan_wholesale_cost,\n  other_chan_sales_price,\n  ratio\nlimit 100;\n\n-- end query 1 in stream 0 using template query78.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query79.sql",
    "content": "-- start query 1 in stream 0 using template query79.tpl and seed 2031708268\nselect \n  c_last_name,c_first_name,substr(s_city,1,30),ss_ticket_number,amt,profit\n  from\n   (select ss_ticket_number\n          ,ss_customer_sk\n          ,store.s_city\n          ,sum(ss_coupon_amt) amt\n          ,sum(ss_net_profit) profit\n    from store_sales,date_dim,store,household_demographics\n    where store_sales.ss_sold_date_sk = date_dim.d_date_sk\n    and store_sales.ss_store_sk = store.s_store_sk  \n    and store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk\n    and (household_demographics.hd_dep_count = 0 or household_demographics.hd_vehicle_count > 3)\n    and date_dim.d_dow = 1\n    and date_dim.d_year in (1998,1998+1,1998+2) \n    and store.s_number_employees between 200 and 295\n    group by ss_ticket_number,ss_customer_sk,ss_addr_sk,store.s_city) ms,customer\n    where ss_customer_sk = c_customer_sk\n order by c_last_name,c_first_name,substr(s_city,1,30), profit\nlimit 100;\n\n-- end query 1 in stream 0 using template query79.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query8.sql",
    "content": "-- start query 1 in stream 0 using template query8.tpl and seed 1766988859\nselect  s_store_name\n      ,sum(ss_net_profit)\n from store_sales\n     ,date_dim\n     ,store,\n     (select ca_zip\n     from (\n      SELECT substr(ca_zip,1,5) ca_zip\n      FROM customer_address\n      WHERE substr(ca_zip,1,5) IN (\n                          '47602','16704','35863','28577','83910','36201',\n                          '58412','48162','28055','41419','80332',\n                          '38607','77817','24891','16226','18410',\n                          '21231','59345','13918','51089','20317',\n                          '17167','54585','67881','78366','47770',\n                          '18360','51717','73108','14440','21800',\n                          '89338','45859','65501','34948','25973',\n                          '73219','25333','17291','10374','18829',\n                          '60736','82620','41351','52094','19326',\n                          '25214','54207','40936','21814','79077',\n                          '25178','75742','77454','30621','89193',\n                          '27369','41232','48567','83041','71948',\n                          '37119','68341','14073','16891','62878',\n                          '49130','19833','24286','27700','40979',\n                          '50412','81504','94835','84844','71954',\n                          '39503','57649','18434','24987','12350',\n                          '86379','27413','44529','98569','16515',\n                          '27287','24255','21094','16005','56436',\n                          '91110','68293','56455','54558','10298',\n                          '83647','32754','27052','51766','19444',\n                          '13869','45645','94791','57631','20712',\n                          '37788','41807','46507','21727','71836',\n                          '81070','50632','88086','63991','20244',\n                          '31655','51782','29818','63792','68605',\n                          
'94898','36430','57025','20601','82080',\n                          '33869','22728','35834','29086','92645',\n                          '98584','98072','11652','78093','57553',\n                          '43830','71144','53565','18700','90209',\n                          '71256','38353','54364','28571','96560',\n                          '57839','56355','50679','45266','84680',\n                          '34306','34972','48530','30106','15371',\n                          '92380','84247','92292','68852','13338',\n                          '34594','82602','70073','98069','85066',\n                          '47289','11686','98862','26217','47529',\n                          '63294','51793','35926','24227','14196',\n                          '24594','32489','99060','49472','43432',\n                          '49211','14312','88137','47369','56877',\n                          '20534','81755','15794','12318','21060',\n                          '73134','41255','63073','81003','73873',\n                          '66057','51184','51195','45676','92696',\n                          '70450','90669','98338','25264','38919',\n                          '59226','58581','60298','17895','19489',\n                          '52301','80846','95464','68770','51634',\n                          '19988','18367','18421','11618','67975',\n                          '25494','41352','95430','15734','62585',\n                          '97173','33773','10425','75675','53535',\n                          '17879','41967','12197','67998','79658',\n                          '59130','72592','14851','43933','68101',\n                          '50636','25717','71286','24660','58058',\n                          '72991','95042','15543','33122','69280',\n                          '11912','59386','27642','65177','17672',\n                          '33467','64592','36335','54010','18767',\n                          '63193','42361','49254','33113','33159',\n                          
'36479','59080','11855','81963','31016',\n                          '49140','29392','41836','32958','53163',\n                          '13844','73146','23952','65148','93498',\n                          '14530','46131','58454','13376','13378',\n                          '83986','12320','17193','59852','46081',\n                          '98533','52389','13086','68843','31013',\n                          '13261','60560','13443','45533','83583',\n                          '11489','58218','19753','22911','25115',\n                          '86709','27156','32669','13123','51933',\n                          '39214','41331','66943','14155','69998',\n                          '49101','70070','35076','14242','73021',\n                          '59494','15782','29752','37914','74686',\n                          '83086','34473','15751','81084','49230',\n                          '91894','60624','17819','28810','63180',\n                          '56224','39459','55233','75752','43639',\n                          '55349','86057','62361','50788','31830',\n                          '58062','18218','85761','60083','45484',\n                          '21204','90229','70041','41162','35390',\n                          '16364','39500','68908','26689','52868',\n                          '81335','40146','11340','61527','61794',\n                          '71997','30415','59004','29450','58117',\n                          '69952','33562','83833','27385','61860',\n                          '96435','48333','23065','32961','84919',\n                          '61997','99132','22815','56600','68730',\n                          '48017','95694','32919','88217','27116',\n                          '28239','58032','18884','16791','21343',\n                          '97462','18569','75660','15475')\n     intersect\n      select ca_zip\n      from (SELECT substr(ca_zip,1,5) ca_zip,count(*) cnt\n            FROM customer_address, customer\n            WHERE ca_address_sk = c_current_addr_sk 
and\n                  c_preferred_cust_flag='Y'\n            group by ca_zip\n            having count(*) > 10)A1)A2) V1\n where ss_store_sk = s_store_sk\n  and ss_sold_date_sk = d_date_sk\n  and d_qoy = 2 and d_year = 1998\n  and (substr(s_zip,1,2) = substr(V1.ca_zip,1,2))\n group by s_store_name\n order by s_store_name\n limit 100;\n\n-- end query 1 in stream 0 using template query8.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query80.sql",
    "content": "-- start query 1 in stream 0 using template query80.tpl and seed 1819994127\nwith ssr as\n (select  s_store_id as store_id,\n          sum(ss_ext_sales_price) as sales,\n          sum(coalesce(sr_return_amt, 0)) as returns,\n          sum(ss_net_profit - coalesce(sr_net_loss, 0)) as profit\n  from store_sales left outer join store_returns on\n         (ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number),\n     date_dim,\n     store,\n     item,\n     promotion\n where ss_sold_date_sk = d_date_sk\n       and d_date between cast('2002-08-06' as date) \n                  and (cast('2002-08-06' as date) +  30 days)\n       and ss_store_sk = s_store_sk\n       and ss_item_sk = i_item_sk\n       and i_current_price > 50\n       and ss_promo_sk = p_promo_sk\n       and p_channel_tv = 'N'\n group by s_store_id)\n ,\n csr as\n (select  cp_catalog_page_id as catalog_page_id,\n          sum(cs_ext_sales_price) as sales,\n          sum(coalesce(cr_return_amount, 0)) as returns,\n          sum(cs_net_profit - coalesce(cr_net_loss, 0)) as profit\n  from catalog_sales left outer join catalog_returns on\n         (cs_item_sk = cr_item_sk and cs_order_number = cr_order_number),\n     date_dim,\n     catalog_page,\n     item,\n     promotion\n where cs_sold_date_sk = d_date_sk\n       and d_date between cast('2002-08-06' as date)\n                  and (cast('2002-08-06' as date) +  30 days)\n        and cs_catalog_page_sk = cp_catalog_page_sk\n       and cs_item_sk = i_item_sk\n       and i_current_price > 50\n       and cs_promo_sk = p_promo_sk\n       and p_channel_tv = 'N'\ngroup by cp_catalog_page_id)\n ,\n wsr as\n (select  web_site_id,\n          sum(ws_ext_sales_price) as sales,\n          sum(coalesce(wr_return_amt, 0)) as returns,\n          sum(ws_net_profit - coalesce(wr_net_loss, 0)) as profit\n  from web_sales left outer join web_returns on\n         (ws_item_sk = wr_item_sk and ws_order_number = wr_order_number),\n     date_dim,\n     
web_site,\n     item,\n     promotion\n where ws_sold_date_sk = d_date_sk\n       and d_date between cast('2002-08-06' as date)\n                  and (cast('2002-08-06' as date) +  30 days)\n        and ws_web_site_sk = web_site_sk\n       and ws_item_sk = i_item_sk\n       and i_current_price > 50\n       and ws_promo_sk = p_promo_sk\n       and p_channel_tv = 'N'\ngroup by web_site_id)\n  select  channel\n        , id\n        , sum(sales) as sales\n        , sum(returns) as returns\n        , sum(profit) as profit\n from \n (select 'store channel' as channel\n        , 'store' || store_id as id\n        , sales\n        , returns\n        , profit\n from   ssr\n union all\n select 'catalog channel' as channel\n        , 'catalog_page' || catalog_page_id as id\n        , sales\n        , returns\n        , profit\n from  csr\n union all\n select 'web channel' as channel\n        , 'web_site' || web_site_id as id\n        , sales\n        , returns\n        , profit\n from   wsr\n ) x\n group by rollup (channel, id)\n order by channel\n         ,id\n limit 100;\n\n-- end query 1 in stream 0 using template query80.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query81.sql",
    "content": "-- start query 1 in stream 0 using template query81.tpl and seed 1819994127\nwith customer_total_return as\n (select cr_returning_customer_sk as ctr_customer_sk\n        ,ca_state as ctr_state, \n \tsum(cr_return_amt_inc_tax) as ctr_total_return\n from catalog_returns\n     ,date_dim\n     ,customer_address\n where cr_returned_date_sk = d_date_sk \n   and d_year =1998\n   and cr_returning_addr_sk = ca_address_sk \n group by cr_returning_customer_sk\n         ,ca_state )\n  select  c_customer_id,c_salutation,c_first_name,c_last_name,ca_street_number,ca_street_name\n                   ,ca_street_type,ca_suite_number,ca_city,ca_county,ca_state,ca_zip,ca_country,ca_gmt_offset\n                  ,ca_location_type,ctr_total_return\n from customer_total_return ctr1\n     ,customer_address\n     ,customer\n where ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2\n \t\t\t  from customer_total_return ctr2 \n                  \t  where ctr1.ctr_state = ctr2.ctr_state)\n       and ca_address_sk = c_current_addr_sk\n       and ca_state = 'TX'\n       and ctr1.ctr_customer_sk = c_customer_sk\n order by c_customer_id,c_salutation,c_first_name,c_last_name,ca_street_number,ca_street_name\n                   ,ca_street_type,ca_suite_number,ca_city,ca_county,ca_state,ca_zip,ca_country,ca_gmt_offset\n                  ,ca_location_type,ctr_total_return\n limit 100;\n\n-- end query 1 in stream 0 using template query81.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query82.sql",
    "content": "-- start query 1 in stream 0 using template query82.tpl and seed 55585014\nselect  i_item_id\n       ,i_item_desc\n       ,i_current_price\n from item, inventory, date_dim, store_sales\n where i_current_price between 49 and 49+30\n and inv_item_sk = i_item_sk\n and d_date_sk=inv_date_sk\n and d_date between cast('2001-01-28' as date) and (cast('2001-01-28' as date) +  60 days)\n and i_manufact_id in (80,675,292,17)\n and inv_quantity_on_hand between 100 and 500\n and ss_item_sk = i_item_sk\n group by i_item_id,i_item_desc,i_current_price\n order by i_item_id\n limit 100;\n\n-- end query 1 in stream 0 using template query82.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query83.sql",
    "content": "-- start query 1 in stream 0 using template query83.tpl and seed 1930872976\nwith sr_items as\n (select i_item_id item_id,\n        sum(sr_return_quantity) sr_item_qty\n from store_returns,\n      item,\n      date_dim\n where sr_item_sk = i_item_sk\n and   d_date    in \n\t(select d_date\n\tfrom date_dim\n\twhere d_week_seq in \n\t\t(select d_week_seq\n\t\tfrom date_dim\n\t  where d_date in ('2000-06-17','2000-08-22','2000-11-17')))\n and   sr_returned_date_sk   = d_date_sk\n group by i_item_id),\n cr_items as\n (select i_item_id item_id,\n        sum(cr_return_quantity) cr_item_qty\n from catalog_returns,\n      item,\n      date_dim\n where cr_item_sk = i_item_sk\n and   d_date    in \n\t(select d_date\n\tfrom date_dim\n\twhere d_week_seq in \n\t\t(select d_week_seq\n\t\tfrom date_dim\n\t  where d_date in ('2000-06-17','2000-08-22','2000-11-17')))\n and   cr_returned_date_sk   = d_date_sk\n group by i_item_id),\n wr_items as\n (select i_item_id item_id,\n        sum(wr_return_quantity) wr_item_qty\n from web_returns,\n      item,\n      date_dim\n where wr_item_sk = i_item_sk\n and   d_date    in \n\t(select d_date\n\tfrom date_dim\n\twhere d_week_seq in \n\t\t(select d_week_seq\n\t\tfrom date_dim\n\t\twhere d_date in ('2000-06-17','2000-08-22','2000-11-17')))\n and   wr_returned_date_sk   = d_date_sk\n group by i_item_id)\n  select  sr_items.item_id\n       ,sr_item_qty\n       ,sr_item_qty/(sr_item_qty+cr_item_qty+wr_item_qty)/3.0 * 100 sr_dev\n       ,cr_item_qty\n       ,cr_item_qty/(sr_item_qty+cr_item_qty+wr_item_qty)/3.0 * 100 cr_dev\n       ,wr_item_qty\n       ,wr_item_qty/(sr_item_qty+cr_item_qty+wr_item_qty)/3.0 * 100 wr_dev\n       ,(sr_item_qty+cr_item_qty+wr_item_qty)/3.0 average\n from sr_items\n     ,cr_items\n     ,wr_items\n where sr_items.item_id=cr_items.item_id\n   and sr_items.item_id=wr_items.item_id \n order by sr_items.item_id\n         ,sr_item_qty\n limit 100;\n\n-- end query 1 in stream 0 using template query83.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query84.sql",
    "content": "-- start query 1 in stream 0 using template query84.tpl and seed 1819994127\nselect  c_customer_id as customer_id\n       , coalesce(c_last_name,'') || ', ' || coalesce(c_first_name,'') as customername\n from customer\n     ,customer_address\n     ,customer_demographics\n     ,household_demographics\n     ,income_band\n     ,store_returns\n where ca_city\t        =  'Hopewell'\n   and c_current_addr_sk = ca_address_sk\n   and ib_lower_bound   >=  37855\n   and ib_upper_bound   <=  37855 + 50000\n   and ib_income_band_sk = hd_income_band_sk\n   and cd_demo_sk = c_current_cdemo_sk\n   and hd_demo_sk = c_current_hdemo_sk\n   and sr_cdemo_sk = cd_demo_sk\n order by c_customer_id\n limit 100;\n\n-- end query 1 in stream 0 using template query84.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query85.sql",
    "content": "-- start query 1 in stream 0 using template query85.tpl and seed 622697896\nselect  substr(r_reason_desc,1,20)\n       ,avg(ws_quantity)\n       ,avg(wr_refunded_cash)\n       ,avg(wr_fee)\n from web_sales, web_returns, web_page, customer_demographics cd1,\n      customer_demographics cd2, customer_address, date_dim, reason \n where ws_web_page_sk = wp_web_page_sk\n   and ws_item_sk = wr_item_sk\n   and ws_order_number = wr_order_number\n   and ws_sold_date_sk = d_date_sk and d_year = 2001\n   and cd1.cd_demo_sk = wr_refunded_cdemo_sk \n   and cd2.cd_demo_sk = wr_returning_cdemo_sk\n   and ca_address_sk = wr_refunded_addr_sk\n   and r_reason_sk = wr_reason_sk\n   and\n   (\n    (\n     cd1.cd_marital_status = 'M'\n     and\n     cd1.cd_marital_status = cd2.cd_marital_status\n     and\n     cd1.cd_education_status = '4 yr Degree'\n     and \n     cd1.cd_education_status = cd2.cd_education_status\n     and\n     ws_sales_price between 100.00 and 150.00\n    )\n   or\n    (\n     cd1.cd_marital_status = 'S'\n     and\n     cd1.cd_marital_status = cd2.cd_marital_status\n     and\n     cd1.cd_education_status = 'College' \n     and\n     cd1.cd_education_status = cd2.cd_education_status\n     and\n     ws_sales_price between 50.00 and 100.00\n    )\n   or\n    (\n     cd1.cd_marital_status = 'D'\n     and\n     cd1.cd_marital_status = cd2.cd_marital_status\n     and\n     cd1.cd_education_status = 'Secondary'\n     and\n     cd1.cd_education_status = cd2.cd_education_status\n     and\n     ws_sales_price between 150.00 and 200.00\n    )\n   )\n   and\n   (\n    (\n     ca_country = 'United States'\n     and\n     ca_state in ('TX', 'VA', 'CA')\n     and ws_net_profit between 100 and 200  \n    )\n    or\n    (\n     ca_country = 'United States'\n     and\n     ca_state in ('AR', 'NE', 'MO')\n     and ws_net_profit between 150 and 300  \n    )\n    or\n    (\n     ca_country = 'United States'\n     and\n     ca_state in ('IA', 'MS', 'WA')\n     and 
ws_net_profit between 50 and 250  \n    )\n   )\ngroup by r_reason_desc\norder by substr(r_reason_desc,1,20)\n        ,avg(ws_quantity)\n        ,avg(wr_refunded_cash)\n        ,avg(wr_fee)\nlimit 100;\n\n-- end query 1 in stream 0 using template query85.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query86.sql",
    "content": "-- start query 1 in stream 0 using template query86.tpl and seed 1819994127\nselect   \n    sum(ws_net_paid) as total_sum\n   ,i_category\n   ,i_class\n   ,grouping(i_category)+grouping(i_class) as lochierarchy\n   ,rank() over (\n \tpartition by grouping(i_category)+grouping(i_class),\n \tcase when grouping(i_class) = 0 then i_category end \n \torder by sum(ws_net_paid) desc) as rank_within_parent\n from\n    web_sales\n   ,date_dim       d1\n   ,item\n where\n    d1.d_month_seq between 1215 and 1215+11\n and d1.d_date_sk = ws_sold_date_sk\n and i_item_sk  = ws_item_sk\n group by rollup(i_category,i_class)\n order by\n   lochierarchy desc,\n   case when lochierarchy = 0 then i_category end,\n   rank_within_parent\n limit 100;\n\n-- end query 1 in stream 0 using template query86.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query87.sql",
    "content": "-- start query 1 in stream 0 using template query87.tpl and seed 1819994127\nselect count(*) \nfrom ((select distinct c_last_name, c_first_name, d_date\n       from store_sales, date_dim, customer\n       where store_sales.ss_sold_date_sk = date_dim.d_date_sk\n         and store_sales.ss_customer_sk = customer.c_customer_sk\n         and d_month_seq between 1221 and 1221+11)\n       except\n      (select distinct c_last_name, c_first_name, d_date\n       from catalog_sales, date_dim, customer\n       where catalog_sales.cs_sold_date_sk = date_dim.d_date_sk\n         and catalog_sales.cs_bill_customer_sk = customer.c_customer_sk\n         and d_month_seq between 1221 and 1221+11)\n       except\n      (select distinct c_last_name, c_first_name, d_date\n       from web_sales, date_dim, customer\n       where web_sales.ws_sold_date_sk = date_dim.d_date_sk\n         and web_sales.ws_bill_customer_sk = customer.c_customer_sk\n         and d_month_seq between 1221 and 1221+11)\n) cool_cust\n;\n\n-- end query 1 in stream 0 using template query87.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query88.sql",
    "content": "-- start query 1 in stream 0 using template query88.tpl and seed 318176889\nselect  *\nfrom\n (select count(*) h8_30_to_9\n from store_sales, household_demographics , time_dim, store\n where ss_sold_time_sk = time_dim.t_time_sk   \n     and ss_hdemo_sk = household_demographics.hd_demo_sk \n     and ss_store_sk = s_store_sk\n     and time_dim.t_hour = 8\n     and time_dim.t_minute >= 30\n     and ((household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or\n          (household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2) or\n          (household_demographics.hd_dep_count = 3 and household_demographics.hd_vehicle_count<=3+2)) \n     and store.s_store_name = 'ese') s1,\n (select count(*) h9_to_9_30 \n from store_sales, household_demographics , time_dim, store\n where ss_sold_time_sk = time_dim.t_time_sk\n     and ss_hdemo_sk = household_demographics.hd_demo_sk\n     and ss_store_sk = s_store_sk \n     and time_dim.t_hour = 9 \n     and time_dim.t_minute < 30\n     and ((household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or\n          (household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2) or\n          (household_demographics.hd_dep_count = 3 and household_demographics.hd_vehicle_count<=3+2))\n     and store.s_store_name = 'ese') s2,\n (select count(*) h9_30_to_10 \n from store_sales, household_demographics , time_dim, store\n where ss_sold_time_sk = time_dim.t_time_sk\n     and ss_hdemo_sk = household_demographics.hd_demo_sk\n     and ss_store_sk = s_store_sk\n     and time_dim.t_hour = 9\n     and time_dim.t_minute >= 30\n     and ((household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or\n          (household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2) or\n          (household_demographics.hd_dep_count = 3 and 
household_demographics.hd_vehicle_count<=3+2))\n     and store.s_store_name = 'ese') s3,\n (select count(*) h10_to_10_30\n from store_sales, household_demographics , time_dim, store\n where ss_sold_time_sk = time_dim.t_time_sk\n     and ss_hdemo_sk = household_demographics.hd_demo_sk\n     and ss_store_sk = s_store_sk\n     and time_dim.t_hour = 10 \n     and time_dim.t_minute < 30\n     and ((household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or\n          (household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2) or\n          (household_demographics.hd_dep_count = 3 and household_demographics.hd_vehicle_count<=3+2))\n     and store.s_store_name = 'ese') s4,\n (select count(*) h10_30_to_11\n from store_sales, household_demographics , time_dim, store\n where ss_sold_time_sk = time_dim.t_time_sk\n     and ss_hdemo_sk = household_demographics.hd_demo_sk\n     and ss_store_sk = s_store_sk\n     and time_dim.t_hour = 10 \n     and time_dim.t_minute >= 30\n     and ((household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or\n          (household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2) or\n          (household_demographics.hd_dep_count = 3 and household_demographics.hd_vehicle_count<=3+2))\n     and store.s_store_name = 'ese') s5,\n (select count(*) h11_to_11_30\n from store_sales, household_demographics , time_dim, store\n where ss_sold_time_sk = time_dim.t_time_sk\n     and ss_hdemo_sk = household_demographics.hd_demo_sk\n     and ss_store_sk = s_store_sk \n     and time_dim.t_hour = 11\n     and time_dim.t_minute < 30\n     and ((household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or\n          (household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2) or\n          (household_demographics.hd_dep_count = 3 and 
household_demographics.hd_vehicle_count<=3+2))\n     and store.s_store_name = 'ese') s6,\n (select count(*) h11_30_to_12\n from store_sales, household_demographics , time_dim, store\n where ss_sold_time_sk = time_dim.t_time_sk\n     and ss_hdemo_sk = household_demographics.hd_demo_sk\n     and ss_store_sk = s_store_sk\n     and time_dim.t_hour = 11\n     and time_dim.t_minute >= 30\n     and ((household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or\n          (household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2) or\n          (household_demographics.hd_dep_count = 3 and household_demographics.hd_vehicle_count<=3+2))\n     and store.s_store_name = 'ese') s7,\n (select count(*) h12_to_12_30\n from store_sales, household_demographics , time_dim, store\n where ss_sold_time_sk = time_dim.t_time_sk\n     and ss_hdemo_sk = household_demographics.hd_demo_sk\n     and ss_store_sk = s_store_sk\n     and time_dim.t_hour = 12\n     and time_dim.t_minute < 30\n     and ((household_demographics.hd_dep_count = 2 and household_demographics.hd_vehicle_count<=2+2) or\n          (household_demographics.hd_dep_count = 4 and household_demographics.hd_vehicle_count<=4+2) or\n          (household_demographics.hd_dep_count = 3 and household_demographics.hd_vehicle_count<=3+2))\n     and store.s_store_name = 'ese') s8\n;\n\n-- end query 1 in stream 0 using template query88.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query89.sql",
    "content": "-- start query 1 in stream 0 using template query89.tpl and seed 1719819282\nselect  *\nfrom(\nselect i_category, i_class, i_brand,\n       s_store_name, s_company_name,\n       d_moy,\n       sum(ss_sales_price) sum_sales,\n       avg(sum(ss_sales_price)) over\n         (partition by i_category, i_brand, s_store_name, s_company_name)\n         avg_monthly_sales\nfrom item, store_sales, date_dim, store\nwhere ss_item_sk = i_item_sk and\n      ss_sold_date_sk = d_date_sk and\n      ss_store_sk = s_store_sk and\n      d_year in (2000) and\n        ((i_category in ('Home','Music','Books') and\n          i_class in ('glassware','classical','fiction')\n         )\n      or (i_category in ('Jewelry','Sports','Women') and\n          i_class in ('semi-precious','baseball','dresses') \n        ))\ngroup by i_category, i_class, i_brand,\n         s_store_name, s_company_name, d_moy) tmp1\nwhere case when (avg_monthly_sales <> 0) then (abs(sum_sales - avg_monthly_sales) / avg_monthly_sales) else null end > 0.1\norder by sum_sales - avg_monthly_sales, s_store_name\nlimit 100;\n\n-- end query 1 in stream 0 using template query89.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query9.sql",
    "content": "-- start query 1 in stream 0 using template query9.tpl and seed 1490436826\nselect case when (select count(*) \n                  from store_sales \n                  where ss_quantity between 1 and 20) > 98972190\n            then (select avg(ss_ext_discount_amt) \n                  from store_sales \n                  where ss_quantity between 1 and 20) \n            else (select avg(ss_net_profit)\n                  from store_sales\n                  where ss_quantity between 1 and 20) end bucket1 ,\n       case when (select count(*)\n                  from store_sales\n                  where ss_quantity between 21 and 40) > 160856845\n            then (select avg(ss_ext_discount_amt)\n                  from store_sales\n                  where ss_quantity between 21 and 40) \n            else (select avg(ss_net_profit)\n                  from store_sales\n                  where ss_quantity between 21 and 40) end bucket2,\n       case when (select count(*)\n                  from store_sales\n                  where ss_quantity between 41 and 60) > 12733327\n            then (select avg(ss_ext_discount_amt)\n                  from store_sales\n                  where ss_quantity between 41 and 60)\n            else (select avg(ss_net_profit)\n                  from store_sales\n                  where ss_quantity between 41 and 60) end bucket3,\n       case when (select count(*)\n                  from store_sales\n                  where ss_quantity between 61 and 80) > 96251173\n            then (select avg(ss_ext_discount_amt)\n                  from store_sales\n                  where ss_quantity between 61 and 80)\n            else (select avg(ss_net_profit)\n                  from store_sales\n                  where ss_quantity between 61 and 80) end bucket4,\n       case when (select count(*)\n                  from store_sales\n                  where ss_quantity between 81 and 100) > 80049606\n            then (select 
avg(ss_ext_discount_amt)\n                  from store_sales\n                  where ss_quantity between 81 and 100)\n            else (select avg(ss_net_profit)\n                  from store_sales\n                  where ss_quantity between 81 and 100) end bucket5\nfrom reason\nwhere r_reason_sk = 1\n;\n\n-- end query 1 in stream 0 using template query9.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query90.sql",
    "content": "-- start query 1 in stream 0 using template query90.tpl and seed 2031708268\nselect  cast(amc as decimal(15,4))/cast(pmc as decimal(15,4)) am_pm_ratio\n from ( select count(*) amc\n       from web_sales, household_demographics , time_dim, web_page\n       where ws_sold_time_sk = time_dim.t_time_sk\n         and ws_ship_hdemo_sk = household_demographics.hd_demo_sk\n         and ws_web_page_sk = web_page.wp_web_page_sk\n         and time_dim.t_hour between 9 and 9+1\n         and household_demographics.hd_dep_count = 3\n         and web_page.wp_char_count between 5000 and 5200) at,\n      ( select count(*) pmc\n       from web_sales, household_demographics , time_dim, web_page\n       where ws_sold_time_sk = time_dim.t_time_sk\n         and ws_ship_hdemo_sk = household_demographics.hd_demo_sk\n         and ws_web_page_sk = web_page.wp_web_page_sk\n         and time_dim.t_hour between 16 and 16+1\n         and household_demographics.hd_dep_count = 3\n         and web_page.wp_char_count between 5000 and 5200) pt\n order by am_pm_ratio\n limit 100;\n\n-- end query 1 in stream 0 using template query90.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query91.sql",
    "content": "-- start query 1 in stream 0 using template query91.tpl and seed 1930872976\nselect  \n        cc_call_center_id Call_Center,\n        cc_name Call_Center_Name,\n        cc_manager Manager,\n        sum(cr_net_loss) Returns_Loss\nfrom\n        call_center,\n        catalog_returns,\n        date_dim,\n        customer,\n        customer_address,\n        customer_demographics,\n        household_demographics\nwhere\n        cr_call_center_sk       = cc_call_center_sk\nand     cr_returned_date_sk     = d_date_sk\nand     cr_returning_customer_sk= c_customer_sk\nand     cd_demo_sk              = c_current_cdemo_sk\nand     hd_demo_sk              = c_current_hdemo_sk\nand     ca_address_sk           = c_current_addr_sk\nand     d_year                  = 2000 \nand     d_moy                   = 12\nand     ( (cd_marital_status       = 'M' and cd_education_status     = 'Unknown')\n        or(cd_marital_status       = 'W' and cd_education_status     = 'Advanced Degree'))\nand     hd_buy_potential like 'Unknown%'\nand     ca_gmt_offset           = -7\ngroup by cc_call_center_id,cc_name,cc_manager,cd_marital_status,cd_education_status\norder by sum(cr_net_loss) desc;\n\n-- end query 1 in stream 0 using template query91.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query92.sql",
    "content": "-- start query 1 in stream 0 using template query92.tpl and seed 2031708268\nselect  \n   sum(ws_ext_discount_amt)  as `Excess Discount Amount` \nfrom \n    web_sales \n   ,item \n   ,date_dim\nwhere\ni_manufact_id = 356\nand i_item_sk = ws_item_sk \nand d_date between '2001-03-12' and \n        (cast('2001-03-12' as date) + 90 days)\nand d_date_sk = ws_sold_date_sk \nand ws_ext_discount_amt  \n     > ( \n         SELECT \n            1.3 * avg(ws_ext_discount_amt) \n         FROM \n            web_sales \n           ,date_dim\n         WHERE \n              ws_item_sk = i_item_sk \n          and d_date between '2001-03-12' and\n                             (cast('2001-03-12' as date) + 90 days)\n          and d_date_sk = ws_sold_date_sk \n      ) \norder by sum(ws_ext_discount_amt)\nlimit 100;\n\n-- end query 1 in stream 0 using template query92.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query93.sql",
    "content": "-- start query 1 in stream 0 using template query93.tpl and seed 1200409435\nselect  ss_customer_sk\n            ,sum(act_sales) sumsales\n      from (select ss_item_sk\n                  ,ss_ticket_number\n                  ,ss_customer_sk\n                  ,case when sr_return_quantity is not null then (ss_quantity-sr_return_quantity)*ss_sales_price\n                                                            else (ss_quantity*ss_sales_price) end act_sales\n            from store_sales left outer join store_returns on (sr_item_sk = ss_item_sk\n                                                               and sr_ticket_number = ss_ticket_number)\n                ,reason\n            where sr_reason_sk = r_reason_sk\n              and r_reason_desc = 'reason 66') t\n      group by ss_customer_sk\n      order by sumsales, ss_customer_sk\nlimit 100;\n\n-- end query 1 in stream 0 using template query93.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query94.sql",
    "content": "-- start query 1 in stream 0 using template query94.tpl and seed 2031708268\nselect  \n   count(distinct ws_order_number) as `order count`\n  ,sum(ws_ext_ship_cost) as `total shipping cost`\n  ,sum(ws_net_profit) as `total net profit`\nfrom\n   web_sales ws1\n  ,date_dim\n  ,customer_address\n  ,web_site\nwhere\n    d_date between '1999-4-01' and \n           (cast('1999-4-01' as date) + 60 days)\nand ws1.ws_ship_date_sk = d_date_sk\nand ws1.ws_ship_addr_sk = ca_address_sk\nand ca_state = 'NE'\nand ws1.ws_web_site_sk = web_site_sk\nand web_company_name = 'pri'\nand exists (select *\n            from web_sales ws2\n            where ws1.ws_order_number = ws2.ws_order_number\n              and ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)\nand not exists(select *\n               from web_returns wr1\n               where ws1.ws_order_number = wr1.wr_order_number)\norder by count(distinct ws_order_number)\nlimit 100;\n\n-- end query 1 in stream 0 using template query94.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query95.sql",
    "content": "-- start query 1 in stream 0 using template query95.tpl and seed 2031708268\nwith ws_wh as\n(select ws1.ws_order_number,ws1.ws_warehouse_sk wh1,ws2.ws_warehouse_sk wh2\n from web_sales ws1,web_sales ws2\n where ws1.ws_order_number = ws2.ws_order_number\n   and ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)\n select  \n   count(distinct ws_order_number) as `order count`\n  ,sum(ws_ext_ship_cost) as `total shipping cost`\n  ,sum(ws_net_profit) as `total net profit`\nfrom\n   web_sales ws1\n  ,date_dim\n  ,customer_address\n  ,web_site\nwhere\n    d_date between '2002-4-01' and \n           (cast('2002-4-01' as date) + 60 days)\nand ws1.ws_ship_date_sk = d_date_sk\nand ws1.ws_ship_addr_sk = ca_address_sk\nand ca_state = 'AL'\nand ws1.ws_web_site_sk = web_site_sk\nand web_company_name = 'pri'\nand ws1.ws_order_number in (select ws_order_number\n                            from ws_wh)\nand ws1.ws_order_number in (select wr_order_number\n                            from web_returns,ws_wh\n                            where wr_order_number = ws_wh.ws_order_number)\norder by count(distinct ws_order_number)\nlimit 100;\n\n-- end query 1 in stream 0 using template query95.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query96.sql",
    "content": "-- start query 1 in stream 0 using template query96.tpl and seed 1819994127\nselect  count(*) \nfrom store_sales\n    ,household_demographics \n    ,time_dim, store\nwhere ss_sold_time_sk = time_dim.t_time_sk   \n    and ss_hdemo_sk = household_demographics.hd_demo_sk \n    and ss_store_sk = s_store_sk\n    and time_dim.t_hour = 16\n    and time_dim.t_minute >= 30\n    and household_demographics.hd_dep_count = 6\n    and store.s_store_name = 'ese'\norder by count(*)\nlimit 100;\n\n-- end query 1 in stream 0 using template query96.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query97.sql",
    "content": "-- start query 1 in stream 0 using template query97.tpl and seed 1819994127\nwith ssci as (\nselect ss_customer_sk customer_sk\n      ,ss_item_sk item_sk\nfrom store_sales,date_dim\nwhere ss_sold_date_sk = d_date_sk\n  and d_month_seq between 1190 and 1190 + 11\ngroup by ss_customer_sk\n        ,ss_item_sk),\ncsci as(\n select cs_bill_customer_sk customer_sk\n      ,cs_item_sk item_sk\nfrom catalog_sales,date_dim\nwhere cs_sold_date_sk = d_date_sk\n  and d_month_seq between 1190 and 1190 + 11\ngroup by cs_bill_customer_sk\n        ,cs_item_sk)\n select  sum(case when ssci.customer_sk is not null and csci.customer_sk is null then 1 else 0 end) store_only\n      ,sum(case when ssci.customer_sk is null and csci.customer_sk is not null then 1 else 0 end) catalog_only\n      ,sum(case when ssci.customer_sk is not null and csci.customer_sk is not null then 1 else 0 end) store_and_catalog\nfrom ssci full outer join csci on (ssci.customer_sk=csci.customer_sk\n                               and ssci.item_sk = csci.item_sk)\nlimit 100;\n\n-- end query 1 in stream 0 using template query97.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query98.sql",
    "content": "-- start query 1 in stream 0 using template query98.tpl and seed 345591136\nselect i_item_id\n      ,i_item_desc \n      ,i_category \n      ,i_class \n      ,i_current_price\n      ,sum(ss_ext_sales_price) as itemrevenue \n      ,sum(ss_ext_sales_price)*100/sum(sum(ss_ext_sales_price)) over\n          (partition by i_class) as revenueratio\nfrom\t\n\tstore_sales\n    \t,item \n    \t,date_dim\nwhere \n\tss_item_sk = i_item_sk \n  \tand i_category in ('Home', 'Sports', 'Men')\n  \tand ss_sold_date_sk = d_date_sk\n\tand d_date between cast('2002-01-05' as date) \n\t\t\t\tand (cast('2002-01-05' as date) + 30 days)\ngroup by \n\ti_item_id\n        ,i_item_desc \n        ,i_category\n        ,i_class\n        ,i_current_price\norder by \n\ti_category\n        ,i_class\n        ,i_item_id\n        ,i_item_desc\n        ,revenueratio;\n\n-- end query 1 in stream 0 using template query98.tpl\n"
  },
  {
    "path": "sample-queries-tpcds/query99.sql",
    "content": "-- start query 1 in stream 0 using template query99.tpl and seed 1819994127\nselect  \n   substr(w_warehouse_name,1,20)\n  ,sm_type\n  ,cc_name\n  ,sum(case when (cs_ship_date_sk - cs_sold_date_sk <= 30 ) then 1 else 0 end)  as `30 days` \n  ,sum(case when (cs_ship_date_sk - cs_sold_date_sk > 30) and \n                 (cs_ship_date_sk - cs_sold_date_sk <= 60) then 1 else 0 end )  as `31-60 days` \n  ,sum(case when (cs_ship_date_sk - cs_sold_date_sk > 60) and \n                 (cs_ship_date_sk - cs_sold_date_sk <= 90) then 1 else 0 end)  as `61-90 days` \n  ,sum(case when (cs_ship_date_sk - cs_sold_date_sk > 90) and\n                 (cs_ship_date_sk - cs_sold_date_sk <= 120) then 1 else 0 end)  as `91-120 days` \n  ,sum(case when (cs_ship_date_sk - cs_sold_date_sk  > 120) then 1 else 0 end)  as `>120 days` \nfrom\n   catalog_sales\n  ,warehouse\n  ,ship_mode\n  ,call_center\n  ,date_dim\nwhere\n    d_month_seq between 1178 and 1178 + 11\nand cs_ship_date_sk   = d_date_sk\nand cs_warehouse_sk   = w_warehouse_sk\nand cs_ship_mode_sk   = sm_ship_mode_sk\nand cs_call_center_sk = cc_call_center_sk\ngroup by\n   substr(w_warehouse_name,1,20)\n  ,sm_type\n  ,cc_name\norder by substr(w_warehouse_name,1,20)\n        ,sm_type\n        ,cc_name\nlimit 100;\n\n-- end query 1 in stream 0 using template query99.tpl\n"
  },
  {
    "path": "sample-queries-tpch/README.md",
    "content": "Sample TPC-H Queries\n====================\n\nThis directory contains sample TPC-H queries you can run once you have generated your data. Queries are compatible with Apache Hive 13 and up.\n"
  },
  {
    "path": "sample-queries-tpch/testbench-withATS.settings",
    "content": "set ambari.hive.db.schema.name=hive;\nset fs.file.impl.disable.cache=true;\nset fs.hdfs.impl.disable.cache=true;\nset hive.auto.convert.join.noconditionaltask=true;\nset hive.auto.convert.join=true;\nset hive.auto.convert.sortmerge.join=true;\nset hive.compactor.abortedtxn.threshold=1000;\nset hive.compactor.check.interval=300;\nset hive.compactor.delta.num.threshold=10;\nset hive.compactor.delta.pct.threshold=0.1f;\nset hive.compactor.initiator.on=false;\nset hive.compactor.worker.threads=0;\nset hive.compactor.worker.timeout=86400;\nset hive.compute.query.using.stats=true;\nset hive.enforce.bucketing=true;\nset hive.enforce.sorting=true;\nset hive.enforce.sortmergebucketmapjoin=true;\nset hive.exec.failure.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook;\nset hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook;\nset hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook;\nset hive.execution.engine=mr;\nset hive.limit.pushdown.memory.usage=0.04;\nset hive.map.aggr=true;\nset hive.mapjoin.bucket.cache.size=10000;\nset hive.mapred.reduce.tasks.speculative.execution=false;\nset hive.metastore.cache.pinobjtypes=Table,Database,Type,FieldSchema,Order;\nset hive.metastore.client.socket.timeout=60;\nset hive.metastore.execute.setugi=true;\nset hive.metastore.warehouse.dir=/apps/hive/warehouse;\nset hive.optimize.bucketmapjoin.sortedmerge=false;\nset hive.optimize.bucketmapjoin=true;\nset hive.optimize.index.filter=true;\nset hive.optimize.reducededuplication.min.reducer=4;\nset hive.optimize.reducededuplication=true;\nset hive.orc.splits.include.file.footer=false;\nset hive.security.authorization.enabled=false;\nset hive.security.metastore.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider;\nset hive.server2.enable.doAs=false;\nset hive.server2.tez.default.queues=default;\nset hive.server2.tez.initialize.default.sessions=false;\nset hive.server2.tez.sessions.per.default.queue=1;\nset 
hive.stats.autogather=true;\nset hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;\nset hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;\nset hive.txn.max.open.batch=1000;\nset hive.txn.timeout=300;\nset hive.vectorized.execution.enabled=true;\nset hive.vectorized.groupby.checkinterval=1024;\nset hive.vectorized.groupby.flush.percent=1;\nset hive.vectorized.groupby.maxentries=1024;\n\n-- These values need to be tuned appropriately to your cluster. These examples are for reference.\n-- set hive.tez.container.size=4096;\n-- set hive.tez.java.opts=-Xmx3800m;\n-- set hive.auto.convert.join.noconditionaltask.size=1252698795;\n"
  },
  {
    "path": "sample-queries-tpch/testbench.settings",
    "content": "set ambari.hive.db.schema.name=hive;\nset fs.file.impl.disable.cache=true;\nset fs.hdfs.impl.disable.cache=true;\nset hive.auto.convert.join.noconditionaltask=true;\nset hive.auto.convert.join=true;\nset hive.auto.convert.sortmerge.join=true;\nset hive.compactor.abortedtxn.threshold=1000;\nset hive.compactor.check.interval=300;\nset hive.compactor.delta.num.threshold=10;\nset hive.compactor.delta.pct.threshold=0.1f;\nset hive.compactor.initiator.on=false;\nset hive.compactor.worker.threads=0;\nset hive.compactor.worker.timeout=86400;\nset hive.compute.query.using.stats=true;\nset hive.enforce.bucketing=true;\nset hive.enforce.sorting=true;\nset hive.enforce.sortmergebucketmapjoin=true;\nset hive.execution.engine=mr;\nset hive.limit.pushdown.memory.usage=0.04;\nset hive.map.aggr=true;\nset hive.mapjoin.bucket.cache.size=10000;\nset hive.mapred.reduce.tasks.speculative.execution=false;\nset hive.metastore.cache.pinobjtypes=Table,Database,Type,FieldSchema,Order;\nset hive.metastore.client.socket.timeout=60;\nset hive.metastore.execute.setugi=true;\nset hive.metastore.warehouse.dir=/apps/hive/warehouse;\nset hive.optimize.bucketmapjoin.sortedmerge=false;\nset hive.optimize.bucketmapjoin=true;\nset hive.optimize.index.filter=true;\nset hive.optimize.reducededuplication.min.reducer=4;\nset hive.optimize.reducededuplication=true;\nset hive.orc.splits.include.file.footer=false;\nset hive.security.authorization.enabled=false;\nset hive.security.metastore.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider;\nset hive.server2.enable.doAs=false;\nset hive.server2.tez.default.queues=default;\nset hive.server2.tez.initialize.default.sessions=false;\nset hive.server2.tez.sessions.per.default.queue=1;\nset hive.stats.autogather=true;\nset hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;\nset hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;\nset 
hive.txn.max.open.batch=1000;\nset hive.txn.timeout=300;\nset hive.vectorized.execution.enabled=true;\nset hive.vectorized.groupby.checkinterval=1024;\nset hive.vectorized.groupby.flush.percent=1;\nset hive.vectorized.groupby.maxentries=1024;\n\n-- These values need to be tuned appropriately to your cluster. These examples are for reference.\n-- set hive.tez.container.size=4096;\n-- set hive.tez.java.opts=-Xmx3800m;\n-- set hive.auto.convert.join.noconditionaltask.size=1252698795;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query1.sql",
    "content": "select\n\tl_returnflag,\n\tl_linestatus,\n\tsum(l_quantity) as sum_qty,\n\tsum(l_extendedprice) as sum_base_price,\n\tsum(l_extendedprice * (1 - l_discount)) as sum_disc_price,\n\tsum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,\n\tavg(l_quantity) as avg_qty,\n\tavg(l_extendedprice) as avg_price,\n\tavg(l_discount) as avg_disc,\n\tcount(*) as count_order\nfrom\n\tlineitem\nwhere\n\tl_shipdate <= '1998-09-16'\ngroup by\n\tl_returnflag,\n\tl_linestatus\norder by\n\tl_returnflag,\n\tl_linestatus;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query10.sql",
    "content": "select\n\tc_custkey,\n\tc_name,\n\tsum(l_extendedprice * (1 - l_discount)) as revenue,\n\tc_acctbal,\n\tn_name,\n\tc_address,\n\tc_phone,\n\tc_comment\nfrom\n\tcustomer,\n\torders,\n\tlineitem,\n\tnation\nwhere\n\tc_custkey = o_custkey\n\tand l_orderkey = o_orderkey\n\tand o_orderdate >= '1993-07-01'\n\tand o_orderdate < '1993-10-01'\n\tand l_returnflag = 'R'\n\tand c_nationkey = n_nationkey\ngroup by\n\tc_custkey,\n\tc_name,\n\tc_acctbal,\n\tc_phone,\n\tn_name,\n\tc_address,\n\tc_comment\norder by\n\trevenue desc\nlimit 20;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query11.sql",
    "content": "drop view q11_part_tmp_cached;\ndrop view q11_sum_tmp_cached;\n\ncreate view q11_part_tmp_cached as\nselect\n\tps_partkey,\n\tsum(ps_supplycost * ps_availqty) as part_value\nfrom\n\tpartsupp,\n\tsupplier,\n\tnation\nwhere\n\tps_suppkey = s_suppkey\n\tand s_nationkey = n_nationkey\n\tand n_name = 'GERMANY'\ngroup by ps_partkey;\n\ncreate view q11_sum_tmp_cached as\nselect\n\tsum(part_value) as total_value\nfrom\n\tq11_part_tmp_cached;\n\nselect\n\tps_partkey, part_value as value\nfrom (\n\tselect\n\t\tps_partkey,\n\t\tpart_value,\n\t\ttotal_value\n\tfrom\n\t\tq11_part_tmp_cached join q11_sum_tmp_cached\n) a\nwhere\n\tpart_value > total_value * 0.0001\norder by\n\tvalue desc;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query12.sql",
    "content": "select\n\tl_shipmode,\n\tsum(case\n\t\twhen o_orderpriority = '1-URGENT'\n\t\t\tor o_orderpriority = '2-HIGH'\n\t\t\tthen 1\n\t\telse 0\n\tend) as high_line_count,\n\tsum(case\n\t\twhen o_orderpriority <> '1-URGENT'\n\t\t\tand o_orderpriority <> '2-HIGH'\n\t\t\tthen 1\n\t\telse 0\n\tend) as low_line_count\nfrom\n\torders,\n\tlineitem\nwhere\n\to_orderkey = l_orderkey\n\tand l_shipmode in ('REG AIR', 'MAIL')\n\tand l_commitdate < l_receiptdate\n\tand l_shipdate < l_commitdate\n\tand l_receiptdate >= '1995-01-01'\n\tand l_receiptdate < '1996-01-01'\ngroup by\n\tl_shipmode\norder by\n\tl_shipmode;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query13.sql",
    "content": "select\n\tc_count,\n\tcount(*) as custdist\nfrom\n\t(\n\t\tselect\n\t\t\tc_custkey,\n\t\t\tcount(o_orderkey) as c_count\n\t\tfrom\n\t\t\tcustomer left outer join orders on\n\t\t\t\tc_custkey = o_custkey\n\t\t\t\tand o_comment not like '%unusual%accounts%'\n\t\tgroup by\n\t\t\tc_custkey\n\t) c_orders\ngroup by\n\tc_count\norder by\n\tcustdist desc,\n\tc_count desc;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query14.sql",
    "content": "select\n\t100.00 * sum(case\n\t\twhen p_type like 'PROMO%'\n\t\t\tthen l_extendedprice * (1 - l_discount)\n\t\telse 0\n\tend) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue\nfrom\n\tlineitem,\n\tpart\nwhere\n\tl_partkey = p_partkey\n\tand l_shipdate >= '1995-08-01'\n\tand l_shipdate < '1995-09-01';\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query15.sql",
    "content": "drop view revenue_cached;\ndrop view max_revenue_cached;\n\ncreate view revenue_cached as\nselect\n\tl_suppkey as supplier_no,\n\tsum(l_extendedprice * (1 - l_discount)) as total_revenue\nfrom\n\tlineitem\nwhere\n\tl_shipdate >= '1996-01-01'\n\tand l_shipdate < '1996-04-01'\ngroup by l_suppkey;\n\ncreate view max_revenue_cached as\nselect\n\tmax(total_revenue) as max_revenue\nfrom\n\trevenue_cached;\n\nselect\n\ts_suppkey,\n\ts_name,\n\ts_address,\n\ts_phone,\n\ttotal_revenue\nfrom\n\tsupplier,\n\trevenue_cached,\n\tmax_revenue_cached\nwhere\n\ts_suppkey = supplier_no\n\tand total_revenue = max_revenue \norder by s_suppkey;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query16.sql",
    "content": "select\n\tp_brand,\n\tp_type,\n\tp_size,\n\tcount(distinct ps_suppkey) as supplier_cnt\nfrom\n\tpartsupp,\n\tpart\nwhere\n\tp_partkey = ps_partkey\n\tand p_brand <> 'Brand#34'\n\tand p_type not like 'ECONOMY BRUSHED%'\n\tand p_size in (22, 14, 27, 49, 21, 33, 35, 28)\n\tand partsupp.ps_suppkey not in (\n\t\tselect\n\t\t\ts_suppkey\n\t\tfrom\n\t\t\tsupplier\n\t\twhere\n\t\t\ts_comment like '%Customer%Complaints%'\n\t)\ngroup by\n\tp_brand,\n\tp_type,\n\tp_size\norder by\n\tsupplier_cnt desc,\n\tp_brand,\n\tp_type,\n\tp_size;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query17.sql",
    "content": "with q17_part as (\n  select p_partkey from part where  \n  p_brand = 'Brand#23'\n  and p_container = 'MED BOX'\n),\nq17_avg as (\n  select l_partkey as t_partkey, 0.2 * avg(l_quantity) as t_avg_quantity\n  from lineitem \n  where l_partkey IN (select p_partkey from q17_part)\n  group by l_partkey\n),\nq17_price as (\n  select\n  l_quantity,\n  l_partkey,\n  l_extendedprice\n  from\n  lineitem\n  where\n  l_partkey IN (select p_partkey from q17_part)\n)\nselect cast(sum(l_extendedprice) / 7.0 as decimal(32,2)) as avg_yearly\nfrom q17_avg, q17_price\nwhere \nt_partkey = l_partkey and l_quantity < t_avg_quantity;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query18.sql",
    "content": "drop view q18_tmp_cached;\ndrop table q18_large_volume_customer_cached;\n\ncreate view q18_tmp_cached as\nselect\n\tl_orderkey,\n\tsum(l_quantity) as t_sum_quantity\nfrom\n\tlineitem\nwhere\n\tl_orderkey is not null\ngroup by\n\tl_orderkey;\n\ncreate table q18_large_volume_customer_cached as\nselect\n\tc_name,\n\tc_custkey,\n\to_orderkey,\n\to_orderdate,\n\to_totalprice,\n\tsum(l_quantity)\nfrom\n\tcustomer,\n\torders,\n\tq18_tmp_cached t,\n\tlineitem l\nwhere\n\tc_custkey = o_custkey\n\tand o_orderkey = t.l_orderkey\n\tand o_orderkey is not null\n\tand t.t_sum_quantity > 300\n\tand o_orderkey = l.l_orderkey\n\tand l.l_orderkey is not null\ngroup by\n\tc_name,\n\tc_custkey,\n\to_orderkey,\n\to_orderdate,\n\to_totalprice\norder by\n\to_totalprice desc,\n\to_orderdate \nlimit 100;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query19.sql",
    "content": "select\n\tsum(l_extendedprice* (1 - l_discount)) as revenue\nfrom\n\tlineitem,\n\tpart\nwhere\n\t(\n\t\tp_partkey = l_partkey\n\t\tand p_brand = 'Brand#32'\n\t\tand p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')\n\t\tand l_quantity >= 7 and l_quantity <= 7 + 10\n\t\tand p_size between 1 and 5\n\t\tand l_shipmode in ('AIR', 'AIR REG')\n\t\tand l_shipinstruct = 'DELIVER IN PERSON'\n\t)\n\tor\n\t(\n\t\tp_partkey = l_partkey\n\t\tand p_brand = 'Brand#35'\n\t\tand p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')\n\t\tand l_quantity >= 15 and l_quantity <= 15 + 10\n\t\tand p_size between 1 and 10\n\t\tand l_shipmode in ('AIR', 'AIR REG')\n\t\tand l_shipinstruct = 'DELIVER IN PERSON'\n\t)\n\tor\n\t(\n\t\tp_partkey = l_partkey\n\t\tand p_brand = 'Brand#24'\n\t\tand p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')\n\t\tand l_quantity >= 26 and l_quantity <= 26 + 10\n\t\tand p_size between 1 and 15\n\t\tand l_shipmode in ('AIR', 'AIR REG')\n\t\tand l_shipinstruct = 'DELIVER IN PERSON'\n\t);\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query2.sql",
    "content": "drop view q2_min_ps_supplycost;\ncreate view q2_min_ps_supplycost as\nselect\n\tp_partkey as min_p_partkey,\n\tmin(ps_supplycost) as min_ps_supplycost\nfrom\n\tpart,\n\tpartsupp,\n\tsupplier,\n\tnation,\n\tregion\nwhere\n\tp_partkey = ps_partkey\n\tand s_suppkey = ps_suppkey\n\tand s_nationkey = n_nationkey\n\tand n_regionkey = r_regionkey\n\tand r_name = 'EUROPE'\ngroup by\n\tp_partkey;\n\nselect\n\ts_acctbal,\n\ts_name,\n\tn_name,\n\tp_partkey,\n\tp_mfgr,\n\ts_address,\n\ts_phone,\n\ts_comment\nfrom\n\tpart,\n\tsupplier,\n\tpartsupp,\n\tnation,\n\tregion,\n\tq2_min_ps_supplycost\nwhere\n\tp_partkey = ps_partkey\n\tand s_suppkey = ps_suppkey\n\tand p_size = 37\n\tand p_type like '%COPPER'\n\tand s_nationkey = n_nationkey\n\tand n_regionkey = r_regionkey\n\tand r_name = 'EUROPE'\n\tand ps_supplycost = min_ps_supplycost\n\tand p_partkey = min_p_partkey\norder by\n\ts_acctbal desc,\n\tn_name,\n\ts_name,\n\tp_partkey\nlimit 100;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query20.sql",
    "content": "-- explain formatted \nwith tmp1 as (\n    select p_partkey from part where p_name like 'forest%'\n),\ntmp2 as (\n    select s_name, s_address, s_suppkey\n    from supplier, nation\n    where s_nationkey = n_nationkey\n    and n_name = 'CANADA'\n),\ntmp3 as (\n    select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey\n    from lineitem, tmp2\n    where l_shipdate >= '1994-01-01' and l_shipdate <= '1995-01-01'\n    and l_suppkey = s_suppkey \n    group by l_partkey, l_suppkey\n),\ntmp4 as (\n    select ps_partkey, ps_suppkey, ps_availqty\n    from partsupp \n    where ps_partkey IN (select p_partkey from tmp1)\n),\ntmp5 as (\nselect\n    ps_suppkey\nfrom\n    tmp4, tmp3\nwhere\n    ps_partkey = l_partkey\n    and ps_suppkey = l_suppkey\n    and ps_availqty > sum_quantity\n)\nselect\n    s_name,\n    s_address\nfrom\n    supplier\nwhere\n    s_suppkey IN (select ps_suppkey from tmp5)\norder by s_name;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query21.sql",
    "content": "-- explain\n\ncreate temporary table l3 stored as orc as \nselect l_orderkey, count(distinct l_suppkey) as cntSupp\nfrom lineitem\nwhere l_receiptdate > l_commitdate and l_orderkey is not null\ngroup by l_orderkey\nhaving cntSupp = 1\n;\n\nwith location as (\nselect supplier.* from supplier, nation where\ns_nationkey = n_nationkey and n_name = 'SAUDI ARABIA'\n)\nselect s_name, count(*) as numwait\nfrom\n(\nselect li.l_suppkey, li.l_orderkey\nfrom lineitem li join orders o on li.l_orderkey = o.o_orderkey and\n                      o.o_orderstatus = 'F'\n     join\n     (\n     select l_orderkey, count(distinct l_suppkey) as cntSupp\n     from lineitem\n     group by l_orderkey\n     ) l2 on li.l_orderkey = l2.l_orderkey and \n             li.l_receiptdate > li.l_commitdate and \n             l2.cntSupp > 1\n) l1 join l3 on l1.l_orderkey = l3.l_orderkey\n join location s on l1.l_suppkey = s.s_suppkey\ngroup by\n s_name\norder by\n numwait desc,\n s_name\nlimit 100;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query22.sql",
    "content": "drop view q22_customer_tmp_cached;\ndrop view q22_customer_tmp1_cached;\ndrop view q22_orders_tmp_cached;\n\ncreate view if not exists q22_customer_tmp_cached as\nselect\n\tc_acctbal,\n\tc_custkey,\n\tsubstr(c_phone, 1, 2) as cntrycode\nfrom\n\tcustomer\nwhere\n\tsubstr(c_phone, 1, 2) = '13' or\n\tsubstr(c_phone, 1, 2) = '31' or\n\tsubstr(c_phone, 1, 2) = '23' or\n\tsubstr(c_phone, 1, 2) = '29' or\n\tsubstr(c_phone, 1, 2) = '30' or\n\tsubstr(c_phone, 1, 2) = '18' or\n\tsubstr(c_phone, 1, 2) = '17';\n \ncreate view if not exists q22_customer_tmp1_cached as\nselect\n\tavg(c_acctbal) as avg_acctbal\nfrom\n\tq22_customer_tmp_cached\nwhere\n\tc_acctbal > 0.00;\n\ncreate view if not exists q22_orders_tmp_cached as\nselect\n\to_custkey\nfrom\n\torders\ngroup by\n\to_custkey;\n\nselect\n\tcntrycode,\n\tcount(1) as numcust,\n\tsum(c_acctbal) as totacctbal\nfrom (\n\tselect\n\t\tcntrycode,\n\t\tc_acctbal,\n\t\tavg_acctbal\n\tfrom\n\t\tq22_customer_tmp1_cached ct1 join (\n\t\t\tselect\n\t\t\t\tcntrycode,\n\t\t\t\tc_acctbal\n\t\t\tfrom\n\t\t\t\tq22_orders_tmp_cached ot\n\t\t\t\tright outer join q22_customer_tmp_cached ct\n\t\t\t\ton ct.c_custkey = ot.o_custkey\n\t\t\twhere\n\t\t\t\to_custkey is null\n\t\t) ct2\n) a\nwhere\n\tc_acctbal > avg_acctbal\ngroup by\n\tcntrycode\norder by\n\tcntrycode;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query3.sql",
    "content": "select\n\tl_orderkey,\n\tsum(l_extendedprice * (1 - l_discount)) as revenue,\n\to_orderdate,\n\to_shippriority\nfrom\n\tcustomer,\n\torders,\n\tlineitem\nwhere\n\tc_mktsegment = 'BUILDING'\n\tand c_custkey = o_custkey\n\tand l_orderkey = o_orderkey\n\tand o_orderdate < '1995-03-22'\n\tand l_shipdate > '1995-03-22'\ngroup by\n\tl_orderkey,\n\to_orderdate,\n\to_shippriority\norder by\n\trevenue desc,\n\to_orderdate\nlimit 10;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query4.sql",
    "content": "select\n\to_orderpriority,\n\tcount(*) as order_count\nfrom\n\torders as o\nwhere\n\to_orderdate >= '1996-05-01'\n\tand o_orderdate < '1996-08-01'\n\tand exists (\n\t\tselect\n\t\t\t*\n\t\tfrom\n\t\t\tlineitem\n\t\twhere\n\t\t\tl_orderkey = o.o_orderkey\n\t\t\tand l_commitdate < l_receiptdate\n\t)\ngroup by\n\to_orderpriority\norder by\n\to_orderpriority;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query5.sql",
    "content": "select\n\tn_name,\n\tsum(l_extendedprice * (1 - l_discount)) as revenue\nfrom\n\tcustomer,\n\torders,\n\tlineitem,\n\tsupplier,\n\tnation,\n\tregion\nwhere\n\tc_custkey = o_custkey\n\tand l_orderkey = o_orderkey\n\tand l_suppkey = s_suppkey\n\tand c_nationkey = s_nationkey\n\tand s_nationkey = n_nationkey\n\tand n_regionkey = r_regionkey\n\tand r_name = 'AFRICA'\n\tand o_orderdate >= '1993-01-01'\n\tand o_orderdate < '1994-01-01'\ngroup by\n\tn_name\norder by\n\trevenue desc;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query6.sql",
    "content": "select\n\tsum(l_extendedprice * l_discount) as revenue\nfrom\n\tlineitem\nwhere\n\tl_shipdate >= '1993-01-01'\n\tand l_shipdate < '1994-01-01'\n\tand l_discount between 0.06 - 0.01 and 0.06 + 0.01\n\tand l_quantity < 25;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query7.sql",
    "content": "select\n\tsupp_nation,\n\tcust_nation,\n\tl_year,\n\tsum(volume) as revenue\nfrom\n\t(\n\t\tselect\n\t\t\tn1.n_name as supp_nation,\n\t\t\tn2.n_name as cust_nation,\n\t\t\tyear(l_shipdate) as l_year,\n\t\t\tl_extendedprice * (1 - l_discount) as volume\n\t\tfrom\n\t\t\tsupplier,\n\t\t\tlineitem,\n\t\t\torders,\n\t\t\tcustomer,\n\t\t\tnation n1,\n\t\t\tnation n2\n\t\twhere\n\t\t\ts_suppkey = l_suppkey\n\t\t\tand o_orderkey = l_orderkey\n\t\t\tand c_custkey = o_custkey\n\t\t\tand s_nationkey = n1.n_nationkey\n\t\t\tand c_nationkey = n2.n_nationkey\n\t\t\tand (\n\t\t\t\t(n1.n_name = 'KENYA' and n2.n_name = 'PERU')\n\t\t\t\tor (n1.n_name = 'PERU' and n2.n_name = 'KENYA')\n\t\t\t)\n\t\t\tand l_shipdate between '1995-01-01' and '1996-12-31'\n\t) as shipping\ngroup by\n\tsupp_nation,\n\tcust_nation,\n\tl_year\norder by\n\tsupp_nation,\n\tcust_nation,\n\tl_year;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query8.sql",
    "content": "select\n\to_year,\n\tsum(case\n\t\twhen nation = 'PERU' then volume\n\t\telse 0\n\tend) / sum(volume) as mkt_share\nfrom\n\t(\n\t\tselect\n\t\t\tyear(o_orderdate) as o_year,\n\t\t\tl_extendedprice * (1 - l_discount) as volume,\n\t\t\tn2.n_name as nation\n\t\tfrom\n\t\t\tpart,\n\t\t\tsupplier,\n\t\t\tlineitem,\n\t\t\torders,\n\t\t\tcustomer,\n\t\t\tnation n1,\n\t\t\tnation n2,\n\t\t\tregion\n\t\twhere\n\t\t\tp_partkey = l_partkey\n\t\t\tand s_suppkey = l_suppkey\n\t\t\tand l_orderkey = o_orderkey\n\t\t\tand o_custkey = c_custkey\n\t\t\tand c_nationkey = n1.n_nationkey\n\t\t\tand n1.n_regionkey = r_regionkey\n\t\t\tand r_name = 'AMERICA'\n\t\t\tand s_nationkey = n2.n_nationkey\n\t\t\tand o_orderdate between '1995-01-01' and '1996-12-31'\n\t\t\tand p_type = 'ECONOMY BURNISHED NICKEL'\n\t) as all_nations\ngroup by\n\to_year\norder by\n\to_year;\n"
  },
  {
    "path": "sample-queries-tpch/tpch_query9.sql",
    "content": "select\n\tnation,\n\to_year,\n\tsum(amount) as sum_profit\nfrom\n\t(\n\t\tselect\n\t\t\tn_name as nation,\n\t\t\tyear(o_orderdate) as o_year,\n\t\t\tl_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount\n\t\tfrom\n\t\t\tpart,\n\t\t\tsupplier,\n\t\t\tlineitem,\n\t\t\tpartsupp,\n\t\t\torders,\n\t\t\tnation\n\t\twhere\n\t\t\ts_suppkey = l_suppkey\n\t\t\tand ps_suppkey = l_suppkey\n\t\t\tand ps_partkey = l_partkey\n\t\t\tand p_partkey = l_partkey\n\t\t\tand o_orderkey = l_orderkey\n\t\t\tand s_nationkey = n_nationkey\n\t\t\tand p_name like '%plum%'\n\t) as profit\ngroup by\n\tnation,\n\to_year\norder by\n\tnation,\n\to_year desc;\n"
  },
  {
    "path": "settings/init.sql",
    "content": "set hive.map.aggr=true;\nset mapreduce.reduce.speculative=false;\nset hive.auto.convert.join=true;\nset hive.optimize.reducededuplication.min.reducer=1;\nset hive.optimize.mapjoin.mapreduce=true;\nset hive.stats.autogather=true;\n\nset mapred.reduce.parallel.copies=30;\n-- set mapred.job.shuffle.input.buffer.percent=0.5;\n-- set mapred.job.reduce.input.buffer.percent=0.2;\nset mapred.map.child.java.opts=-server -Xmx2800m -Djava.net.preferIPv4Stack=true;\nset mapred.reduce.child.java.opts=-server -Xmx3800m -Djava.net.preferIPv4Stack=true;\nset mapreduce.map.memory.mb=3072;\nset mapreduce.reduce.memory.mb=4096;\nset hive.llap.memory.oversubscription.max.executors.per.query=8;\nset hive.llap.mapjoin.memory.oversubscribe.factor=0.3;\nset hive.auto.convert.join.hashtable.max.entries=-1;\nset hive.optimize.bucketmapjoin=false;\nset hive.convert.join.bucket.mapjoin.tez=false;\nset hive.auto.convert.join.shuffle.max.size=10000000000;\nset hive.tez.llap.min.reducer.per.executor=0.33;\nset hive.map.aggr.hash.min.reduction=0.99;\n\nset hive.optimize.sort.dynamic.partition.threshold=0;\n"
  },
  {
    "path": "settings/load-flat.sql",
    "content": "--set hive.enforce.bucketing=true;\n--set hive.enforce.sorting=true;\nset hive.exec.dynamic.partition.mode=nonstrict;\nset hive.exec.max.dynamic.partitions.pernode=1000000;\nset hive.exec.max.dynamic.partitions=1000000;\nset hive.exec.max.created.files=1000000;\n\n-- set mapreduce.input.fileinputformat.split.minsize=240000000;\n-- set mapreduce.input.fileinputformat.split.maxsize=240000000;\n-- set mapreduce.input.fileinputformat.split.minsize.per.node=240000000;\n-- set mapreduce.input.fileinputformat.split.minsize.per.rack=240000000;\n--set hive.exec.parallel=true;\nset hive.stats.autogather=true;\n-- set hive.support.concurrency=false;\n-- set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;\n\nset hive.optimize.sort.dynamic.partition.threshold=0;\n"
  },
  {
    "path": "settings/load-partitioned.sql",
    "content": "-- set hive.enforce.bucketing=true;\n-- set hive.enforce.sorting=true;\nset hive.exec.dynamic.partition.mode=nonstrict;\nset hive.exec.max.dynamic.partitions.pernode=100000;\nset hive.exec.max.dynamic.partitions=100000;\nset hive.exec.max.created.files=1000000;\nset hive.exec.parallel=true;\nset hive.exec.reducers.max=${REDUCERS};\nset hive.stats.autogather=true;\nset hive.optimize.sort.dynamic.partition=true;\n\n-- set mapred.job.reduce.input.buffer.percent=0.0;\n-- set mapreduce.input.fileinputformat.split.minsize=240000000;\n-- set mapreduce.input.fileinputformat.split.minsize.per.node=240000000;\n-- set mapreduce.input.fileinputformat.split.minsize.per.rack=240000000;\nset hive.optimize.sort.dynamic.partition=true;\n-- set hive.tez.java.opts=-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/;\n\nset tez.runtime.empty.partitions.info-via-events.enabled=true;\nset tez.runtime.report.partition.stats=true;\n-- fewer files for the NULL partition\nset hive.tez.auto.reducer.parallelism=true;\nset hive.tez.min.partition.factor=0.01; \n\n-- set mapred.map.child.java.opts=-server -Xmx2800m -Djava.net.preferIPv4Stack=true;\n-- set mapred.reduce.child.java.opts=-server -Xms1024m -Xmx3800m -Djava.net.preferIPv4Stack=true;\n-- set mapreduce.map.memory.mb=3072;\n-- set mapreduce.reduce.memory.mb=4096;\n-- set io.sort.mb=800;\n\nset hive.optimize.sort.dynamic.partition.threshold=0;\n"
  },
  {
    "path": "spark-queries-tpcds/LICENSE",
    "content": "                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      \"License\" shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      \"Licensor\" shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      \"Legal Entity\" shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      \"control\" means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      \"You\" (or \"Your\") shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      \"Source\" form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      \"Object\" form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      \"Work\" shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      \"Derivative Works\" shall mean any work, whether in Source or Object\n      
form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      \"Contribution\" shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, \"submitted\"\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as \"Not a Contribution.\"\n\n      \"Contributor\" shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a \"NOTICE\" text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an \"AS IS\" BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets \"[]\"\n      replaced with your own identifying information. 
(Don't include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same \"printed page\" as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [yyyy] [name of copyright owner]\n\n   Licensed under the Apache License, Version 2.0 (the \"License\");\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an \"AS IS\" BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n\n\n=======================================================================\nApache Spark Subcomponents:\n\nThe Apache Spark project contains subcomponents with separate copyright\nnotices and license terms. 
Your use of the source code for the these\nsubcomponents is subject to the terms and conditions of the following\nlicenses.\n\n\n========================================================================\nFor heapq (pyspark/heapq3.py):\n========================================================================\n\nSee license/LICENSE-heapq.txt\n\n========================================================================\nFor SnapTree:\n========================================================================\n\nSee license/LICENSE-SnapTree.txt\n\n========================================================================\nFor jbcrypt:\n========================================================================\n\nSee license/LICENSE-jbcrypt.txt\n\n========================================================================\nBSD-style licenses\n========================================================================\n\nThe following components are provided under a BSD-style license. See project link for details.\nThe text of each license is also included at licenses/LICENSE-[project].txt.\n\n     (BSD 3 Clause) netlib core (com.github.fommil.netlib:core:1.1.2 - https://github.com/fommil/netlib-java/core)\n     (BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.2.7 - https://github.com/jpmml/jpmml-model)\n     (BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)\n     (BSD License) ANTLR 4.5.2-1 (org.antlr:antlr4:4.5.2-1 - http://wwww.antlr.org/)\n     (BSD licence) ANTLR ST4 4.0.4 (org.antlr:ST4:4.0.4 - http://www.stringtemplate.org)\n     (BSD licence) ANTLR StringTemplate (org.antlr:stringtemplate:3.2.1 - http://www.stringtemplate.org)\n     (BSD License) Javolution (javolution:javolution:5.5.1 - http://javolution.org)\n     (BSD) JLine (jline:jline:0.9.94 - http://jline.sourceforge.net)\n     (BSD) ParaNamer Core (com.thoughtworks.paranamer:paranamer:2.3 - http://paranamer.codehaus.org/paranamer)\n     (BSD) ParaNamer Core 
(com.thoughtworks.paranamer:paranamer:2.6 - http://paranamer.codehaus.org/paranamer)\n     (BSD 3 Clause) Scala (http://www.scala-lang.org/download/#License)\n        (Interpreter classes (all .scala files in repl/src/main/scala\n        except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),\n        and for SerializableMapWrapper in JavaUtils.scala)\n     (BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.8 - http://www.scala-lang.org/)\n     (BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.8 - http://www.scala-lang.org/)\n     (BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.8 - http://www.scala-lang.org/)\n     (BSD-like) Scala Library (org.scala-lang:scala-library:2.11.8 - http://www.scala-lang.org/)\n     (BSD-like) Scalap (org.scala-lang:scalap:2.11.8 - http://www.scala-lang.org/)\n     (BSD-style) scalacheck (org.scalacheck:scalacheck_2.11:1.10.0 - http://www.scalacheck.org)\n     (BSD-style) spire (org.spire-math:spire_2.11:0.7.1 - http://spire-math.org)\n     (BSD-style) spire-macros (org.spire-math:spire-macros_2.11:0.7.1 - http://spire-math.org)\n     (New BSD License) Kryo (com.esotericsoftware:kryo:3.0.3 - https://github.com/EsotericSoftware/kryo)\n     (New BSD License) MinLog (com.esotericsoftware:minlog:1.3.0 - https://github.com/EsotericSoftware/minlog)\n     (New BSD license) Protocol Buffer Java API (com.google.protobuf:protobuf-java:2.5.0 - http://code.google.com/p/protobuf)\n     (New BSD license) Protocol Buffer Java API (org.spark-project.protobuf:protobuf-java:2.4.1-shaded - http://code.google.com/p/protobuf)\n     (The BSD License) Fortran to Java ARPACK (net.sourceforge.f2j:arpack_combined_all:0.1 - http://f2j.sourceforge.net)\n     (The BSD License) xmlenc Library (xmlenc:xmlenc:0.52 - http://xmlenc.sourceforge.net)\n     (The New BSD License) Py4J (net.sf.py4j:py4j:0.10.6 - http://py4j.sourceforge.net/)\n     (Two-clause BSD-style license) JUnit-Interface 
(com.novocode:junit-interface:0.10 - http://github.com/szeiger/junit-interface/)\n     (BSD licence) sbt and sbt-launch-lib.bash\n     (BSD 3 Clause) d3.min.js (https://github.com/mbostock/d3/blob/master/LICENSE)\n     (BSD 3 Clause) DPark (https://github.com/douban/dpark/blob/master/LICENSE)\n     (BSD 3 Clause) CloudPickle (https://github.com/cloudpipe/cloudpickle/blob/master/LICENSE)\n\n========================================================================\nMIT licenses\n========================================================================\n\nThe following components are provided under the MIT License. See project link for details.\nThe text of each license is also included at licenses/LICENSE-[project].txt.\n\n     (MIT License) JCL 1.1.1 implemented over SLF4J (org.slf4j:jcl-over-slf4j:1.7.5 - http://www.slf4j.org)\n     (MIT License) JUL to SLF4J bridge (org.slf4j:jul-to-slf4j:1.7.5 - http://www.slf4j.org)\n     (MIT License) SLF4J API Module (org.slf4j:slf4j-api:1.7.5 - http://www.slf4j.org)\n     (MIT License) SLF4J LOG4J-12 Binding (org.slf4j:slf4j-log4j12:1.7.5 - http://www.slf4j.org)\n     (MIT License) pyrolite (org.spark-project:pyrolite:2.0.1 - http://pythonhosted.org/Pyro4/)\n     (MIT License) scopt (com.github.scopt:scopt_2.11:3.2.0 - https://github.com/scopt/scopt)\n     (The MIT License) Mockito (org.mockito:mockito-core:1.9.5 - http://www.mockito.org)\n     (MIT License) jquery (https://jquery.org/license/)\n     (MIT License) AnchorJS (https://github.com/bryanbraun/anchorjs)\n     (MIT License) graphlib-dot (https://github.com/cpettitt/graphlib-dot)\n     (MIT License) dagre-d3 (https://github.com/cpettitt/dagre-d3)\n     (MIT License) sorttable (https://github.com/stuartlangridge/sorttable)\n     (MIT License) boto (https://github.com/boto/boto/blob/develop/LICENSE)\n     (MIT License) datatables (http://datatables.net/license)\n     (MIT License) mustache (https://github.com/mustache/mustache/blob/master/LICENSE)\n     (MIT License) 
cookies (http://code.google.com/p/cookies/wiki/License)\n     (MIT License) blockUI (http://jquery.malsup.com/block/)\n     (MIT License) RowsGroup (http://datatables.net/license/mit)\n     (MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)\n     (MIT License) modernizr (https://github.com/Modernizr/Modernizr/blob/master/LICENSE)\n     (MIT License) machinist (https://github.com/typelevel/machinist)\n"
  },
  {
    "path": "spark-queries-tpcds/README.md",
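    "content": "These are the 99 TPC-DS queries taken from the Apache Spark 2.2 test resources. Queries that have two variants, such as q14, are split into separate files (q14a.sql and q14b.sql).\n\n- https://github.com/apache/spark/tree/master/sql/core/src/test/resources/tpcds\n"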
  },
  {
    "path": "spark-queries-tpcds/q1.sql",
    "content": "WITH customer_total_return AS\n( SELECT\n    sr_customer_sk AS ctr_customer_sk,\n    sr_store_sk AS ctr_store_sk,\n    sum(sr_return_amt) AS ctr_total_return\n  FROM store_returns, date_dim\n  WHERE sr_returned_date_sk = d_date_sk AND d_year = 2000\n  GROUP BY sr_customer_sk, sr_store_sk)\nSELECT c_customer_id\nFROM customer_total_return ctr1, store, customer\nWHERE ctr1.ctr_total_return >\n  (SELECT avg(ctr_total_return) * 1.2\n  FROM customer_total_return ctr2\n  WHERE ctr1.ctr_store_sk = ctr2.ctr_store_sk)\n  AND s_store_sk = ctr1.ctr_store_sk\n  AND s_state = 'TN'\n  AND ctr1.ctr_customer_sk = c_customer_sk\nORDER BY c_customer_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q10.sql",
    "content": "SELECT\n  cd_gender,\n  cd_marital_status,\n  cd_education_status,\n  count(*) cnt1,\n  cd_purchase_estimate,\n  count(*) cnt2,\n  cd_credit_rating,\n  count(*) cnt3,\n  cd_dep_count,\n  count(*) cnt4,\n  cd_dep_employed_count,\n  count(*) cnt5,\n  cd_dep_college_count,\n  count(*) cnt6\nFROM\n  customer c, customer_address ca, customer_demographics\nWHERE\n  c.c_current_addr_sk = ca.ca_address_sk AND\n    ca_county IN ('Rush County', 'Toole County', 'Jefferson County',\n                  'Dona Ana County', 'La Porte County') AND\n    cd_demo_sk = c.c_current_cdemo_sk AND\n    exists(SELECT *\n           FROM store_sales, date_dim\n           WHERE c.c_customer_sk = ss_customer_sk AND\n             ss_sold_date_sk = d_date_sk AND\n             d_year = 2002 AND\n             d_moy BETWEEN 1 AND 1 + 3) AND\n    (exists(SELECT *\n            FROM web_sales, date_dim\n            WHERE c.c_customer_sk = ws_bill_customer_sk AND\n              ws_sold_date_sk = d_date_sk AND\n              d_year = 2002 AND\n              d_moy BETWEEN 1 AND 1 + 3) OR\n      exists(SELECT *\n             FROM catalog_sales, date_dim\n             WHERE c.c_customer_sk = cs_ship_customer_sk AND\n               cs_sold_date_sk = d_date_sk AND\n               d_year = 2002 AND\n               d_moy BETWEEN 1 AND 1 + 3))\nGROUP BY cd_gender,\n  cd_marital_status,\n  cd_education_status,\n  cd_purchase_estimate,\n  cd_credit_rating,\n  cd_dep_count,\n  cd_dep_employed_count,\n  cd_dep_college_count\nORDER BY cd_gender,\n  cd_marital_status,\n  cd_education_status,\n  cd_purchase_estimate,\n  cd_credit_rating,\n  cd_dep_count,\n  cd_dep_employed_count,\n  cd_dep_college_count\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q11.sql",
    "content": "WITH year_total AS (\n  SELECT\n    c_customer_id customer_id,\n    c_first_name customer_first_name,\n    c_last_name customer_last_name,\n    c_preferred_cust_flag customer_preferred_cust_flag,\n    c_birth_country customer_birth_country,\n    c_login customer_login,\n    c_email_address customer_email_address,\n    d_year dyear,\n    sum(ss_ext_list_price - ss_ext_discount_amt) year_total,\n    's' sale_type\n  FROM customer, store_sales, date_dim\n  WHERE c_customer_sk = ss_customer_sk\n    AND ss_sold_date_sk = d_date_sk\n  GROUP BY c_customer_id\n    , c_first_name\n    , c_last_name\n    , d_year\n    , c_preferred_cust_flag\n    , c_birth_country\n    , c_login\n    , c_email_address\n    , d_year\n  UNION ALL\n  SELECT\n    c_customer_id customer_id,\n    c_first_name customer_first_name,\n    c_last_name customer_last_name,\n    c_preferred_cust_flag customer_preferred_cust_flag,\n    c_birth_country customer_birth_country,\n    c_login customer_login,\n    c_email_address customer_email_address,\n    d_year dyear,\n    sum(ws_ext_list_price - ws_ext_discount_amt) year_total,\n    'w' sale_type\n  FROM customer, web_sales, date_dim\n  WHERE c_customer_sk = ws_bill_customer_sk\n    AND ws_sold_date_sk = d_date_sk\n  GROUP BY\n    c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country,\n    c_login, c_email_address, d_year)\nSELECT t_s_secyear.customer_preferred_cust_flag\nFROM year_total t_s_firstyear\n  , year_total t_s_secyear\n  , year_total t_w_firstyear\n  , year_total t_w_secyear\nWHERE t_s_secyear.customer_id = t_s_firstyear.customer_id\n  AND t_s_firstyear.customer_id = t_w_secyear.customer_id\n  AND t_s_firstyear.customer_id = t_w_firstyear.customer_id\n  AND t_s_firstyear.sale_type = 's'\n  AND t_w_firstyear.sale_type = 'w'\n  AND t_s_secyear.sale_type = 's'\n  AND t_w_secyear.sale_type = 'w'\n  AND t_s_firstyear.dyear = 2001\n  AND t_s_secyear.dyear = 2001 + 1\n  AND t_w_firstyear.dyear = 2001\n  AND 
t_w_secyear.dyear = 2001 + 1\n  AND t_s_firstyear.year_total > 0\n  AND t_w_firstyear.year_total > 0\n  AND CASE WHEN t_w_firstyear.year_total > 0\n  THEN t_w_secyear.year_total / t_w_firstyear.year_total\n      ELSE NULL END\n  > CASE WHEN t_s_firstyear.year_total > 0\n  THEN t_s_secyear.year_total / t_s_firstyear.year_total\n    ELSE NULL END\nORDER BY t_s_secyear.customer_preferred_cust_flag\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q12.sql",
    "content": "SELECT\n  i_item_desc,\n  i_category,\n  i_class,\n  i_current_price,\n  sum(ws_ext_sales_price) AS itemrevenue,\n  sum(ws_ext_sales_price) * 100 / sum(sum(ws_ext_sales_price))\n  OVER\n  (PARTITION BY i_class) AS revenueratio\nFROM\n  web_sales, item, date_dim\nWHERE\n  ws_item_sk = i_item_sk\n    AND i_category IN ('Sports', 'Books', 'Home')\n    AND ws_sold_date_sk = d_date_sk\n    AND d_date BETWEEN cast('1999-02-22' AS DATE)\n  AND (cast('1999-02-22' AS DATE) + INTERVAL 30 days)\nGROUP BY\n  i_item_id, i_item_desc, i_category, i_class, i_current_price\nORDER BY\n  i_category, i_class, i_item_id, i_item_desc, revenueratio\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q13.sql",
    "content": "SELECT\n  avg(ss_quantity),\n  avg(ss_ext_sales_price),\n  avg(ss_ext_wholesale_cost),\n  sum(ss_ext_wholesale_cost)\nFROM store_sales\n  , store\n  , customer_demographics\n  , household_demographics\n  , customer_address\n  , date_dim\nWHERE s_store_sk = ss_store_sk\n  AND ss_sold_date_sk = d_date_sk AND d_year = 2001\n  AND ((ss_hdemo_sk = hd_demo_sk\n  AND cd_demo_sk = ss_cdemo_sk\n  AND cd_marital_status = 'M'\n  AND cd_education_status = 'Advanced Degree'\n  AND ss_sales_price BETWEEN 100.00 AND 150.00\n  AND hd_dep_count = 3\n) OR\n  (ss_hdemo_sk = hd_demo_sk\n    AND cd_demo_sk = ss_cdemo_sk\n    AND cd_marital_status = 'S'\n    AND cd_education_status = 'College'\n    AND ss_sales_price BETWEEN 50.00 AND 100.00\n    AND hd_dep_count = 1\n  ) OR\n  (ss_hdemo_sk = hd_demo_sk\n    AND cd_demo_sk = ss_cdemo_sk\n    AND cd_marital_status = 'W'\n    AND cd_education_status = '2 yr Degree'\n    AND ss_sales_price BETWEEN 150.00 AND 200.00\n    AND hd_dep_count = 1\n  ))\n  AND ((ss_addr_sk = ca_address_sk\n  AND ca_country = 'United States'\n  AND ca_state IN ('TX', 'OH', 'TX')\n  AND ss_net_profit BETWEEN 100 AND 200\n) OR\n  (ss_addr_sk = ca_address_sk\n    AND ca_country = 'United States'\n    AND ca_state IN ('OR', 'NM', 'KY')\n    AND ss_net_profit BETWEEN 150 AND 300\n  ) OR\n  (ss_addr_sk = ca_address_sk\n    AND ca_country = 'United States'\n    AND ca_state IN ('VA', 'TX', 'MS')\n    AND ss_net_profit BETWEEN 50 AND 250\n  ))\n"
  },
  {
    "path": "spark-queries-tpcds/q14a.sql",
    "content": "WITH cross_items AS\n(SELECT i_item_sk ss_item_sk\n  FROM item,\n    (SELECT\n      iss.i_brand_id brand_id,\n      iss.i_class_id class_id,\n      iss.i_category_id category_id\n    FROM store_sales, item iss, date_dim d1\n    WHERE ss_item_sk = iss.i_item_sk\n      AND ss_sold_date_sk = d1.d_date_sk\n      AND d1.d_year BETWEEN 1999 AND 1999 + 2\n    INTERSECT\n    SELECT\n      ics.i_brand_id,\n      ics.i_class_id,\n      ics.i_category_id\n    FROM catalog_sales, item ics, date_dim d2\n    WHERE cs_item_sk = ics.i_item_sk\n      AND cs_sold_date_sk = d2.d_date_sk\n      AND d2.d_year BETWEEN 1999 AND 1999 + 2\n    INTERSECT\n    SELECT\n      iws.i_brand_id,\n      iws.i_class_id,\n      iws.i_category_id\n    FROM web_sales, item iws, date_dim d3\n    WHERE ws_item_sk = iws.i_item_sk\n      AND ws_sold_date_sk = d3.d_date_sk\n      AND d3.d_year BETWEEN 1999 AND 1999 + 2) x\n  WHERE i_brand_id = brand_id\n    AND i_class_id = class_id\n    AND i_category_id = category_id\n),\n    avg_sales AS\n  (SELECT avg(quantity * list_price) average_sales\n  FROM (\n         SELECT\n           ss_quantity quantity,\n           ss_list_price list_price\n         FROM store_sales, date_dim\n         WHERE ss_sold_date_sk = d_date_sk\n           AND d_year BETWEEN 1999 AND 2001\n         UNION ALL\n         SELECT\n           cs_quantity quantity,\n           cs_list_price list_price\n         FROM catalog_sales, date_dim\n         WHERE cs_sold_date_sk = d_date_sk\n           AND d_year BETWEEN 1999 AND 1999 + 2\n         UNION ALL\n         SELECT\n           ws_quantity quantity,\n           ws_list_price list_price\n         FROM web_sales, date_dim\n         WHERE ws_sold_date_sk = d_date_sk\n           AND d_year BETWEEN 1999 AND 1999 + 2) x)\nSELECT\n  channel,\n  i_brand_id,\n  i_class_id,\n  i_category_id,\n  sum(sales),\n  sum(number_sales)\nFROM (\n       SELECT\n         'store' channel,\n         i_brand_id,\n         i_class_id,\n         
i_category_id,\n         sum(ss_quantity * ss_list_price) sales,\n         count(*) number_sales\n       FROM store_sales, item, date_dim\n       WHERE ss_item_sk IN (SELECT ss_item_sk\n       FROM cross_items)\n         AND ss_item_sk = i_item_sk\n         AND ss_sold_date_sk = d_date_sk\n         AND d_year = 1999 + 2\n         AND d_moy = 11\n       GROUP BY i_brand_id, i_class_id, i_category_id\n       HAVING sum(ss_quantity * ss_list_price) > (SELECT average_sales\n       FROM avg_sales)\n       UNION ALL\n       SELECT\n         'catalog' channel,\n         i_brand_id,\n         i_class_id,\n         i_category_id,\n         sum(cs_quantity * cs_list_price) sales,\n         count(*) number_sales\n       FROM catalog_sales, item, date_dim\n       WHERE cs_item_sk IN (SELECT ss_item_sk\n       FROM cross_items)\n         AND cs_item_sk = i_item_sk\n         AND cs_sold_date_sk = d_date_sk\n         AND d_year = 1999 + 2\n         AND d_moy = 11\n       GROUP BY i_brand_id, i_class_id, i_category_id\n       HAVING sum(cs_quantity * cs_list_price) > (SELECT average_sales FROM avg_sales)\n       UNION ALL\n       SELECT\n         'web' channel,\n         i_brand_id,\n         i_class_id,\n         i_category_id,\n         sum(ws_quantity * ws_list_price) sales,\n         count(*) number_sales\n       FROM web_sales, item, date_dim\n       WHERE ws_item_sk IN (SELECT ss_item_sk\n       FROM cross_items)\n         AND ws_item_sk = i_item_sk\n         AND ws_sold_date_sk = d_date_sk\n         AND d_year = 1999 + 2\n         AND d_moy = 11\n       GROUP BY i_brand_id, i_class_id, i_category_id\n       HAVING sum(ws_quantity * ws_list_price) > (SELECT average_sales\n       FROM avg_sales)\n     ) y\nGROUP BY ROLLUP (channel, i_brand_id, i_class_id, i_category_id)\nORDER BY channel, i_brand_id, i_class_id, i_category_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q14b.sql",
    "content": "WITH cross_items AS\n(SELECT i_item_sk ss_item_sk\n  FROM item,\n    (SELECT\n      iss.i_brand_id brand_id,\n      iss.i_class_id class_id,\n      iss.i_category_id category_id\n    FROM store_sales, item iss, date_dim d1\n    WHERE ss_item_sk = iss.i_item_sk\n      AND ss_sold_date_sk = d1.d_date_sk\n      AND d1.d_year BETWEEN 1999 AND 1999 + 2\n    INTERSECT\n    SELECT\n      ics.i_brand_id,\n      ics.i_class_id,\n      ics.i_category_id\n    FROM catalog_sales, item ics, date_dim d2\n    WHERE cs_item_sk = ics.i_item_sk\n      AND cs_sold_date_sk = d2.d_date_sk\n      AND d2.d_year BETWEEN 1999 AND 1999 + 2\n    INTERSECT\n    SELECT\n      iws.i_brand_id,\n      iws.i_class_id,\n      iws.i_category_id\n    FROM web_sales, item iws, date_dim d3\n    WHERE ws_item_sk = iws.i_item_sk\n      AND ws_sold_date_sk = d3.d_date_sk\n      AND d3.d_year BETWEEN 1999 AND 1999 + 2) x\n  WHERE i_brand_id = brand_id\n    AND i_class_id = class_id\n    AND i_category_id = category_id\n),\n    avg_sales AS\n  (SELECT avg(quantity * list_price) average_sales\n  FROM (SELECT\n          ss_quantity quantity,\n          ss_list_price list_price\n        FROM store_sales, date_dim\n        WHERE ss_sold_date_sk = d_date_sk AND d_year BETWEEN 1999 AND 1999 + 2\n        UNION ALL\n        SELECT\n          cs_quantity quantity,\n          cs_list_price list_price\n        FROM catalog_sales, date_dim\n        WHERE cs_sold_date_sk = d_date_sk AND d_year BETWEEN 1999 AND 1999 + 2\n        UNION ALL\n        SELECT\n          ws_quantity quantity,\n          ws_list_price list_price\n        FROM web_sales, date_dim\n        WHERE ws_sold_date_sk = d_date_sk AND d_year BETWEEN 1999 AND 1999 + 2) x)\nSELECT *\nFROM\n  (SELECT\n    'store' channel,\n    i_brand_id,\n    i_class_id,\n    i_category_id,\n    sum(ss_quantity * ss_list_price) sales,\n    count(*) number_sales\n  FROM store_sales, item, date_dim\n  WHERE ss_item_sk IN (SELECT ss_item_sk\n  FROM 
cross_items)\n    AND ss_item_sk = i_item_sk\n    AND ss_sold_date_sk = d_date_sk\n    AND d_week_seq = (SELECT d_week_seq\n  FROM date_dim\n  WHERE d_year = 1999 + 1 AND d_moy = 12 AND d_dom = 11)\n  GROUP BY i_brand_id, i_class_id, i_category_id\n  HAVING sum(ss_quantity * ss_list_price) > (SELECT average_sales\n  FROM avg_sales)) this_year,\n  (SELECT\n    'store' channel,\n    i_brand_id,\n    i_class_id,\n    i_category_id,\n    sum(ss_quantity * ss_list_price) sales,\n    count(*) number_sales\n  FROM store_sales, item, date_dim\n  WHERE ss_item_sk IN (SELECT ss_item_sk\n  FROM cross_items)\n    AND ss_item_sk = i_item_sk\n    AND ss_sold_date_sk = d_date_sk\n    AND d_week_seq = (SELECT d_week_seq\n  FROM date_dim\n  WHERE d_year = 1999 AND d_moy = 12 AND d_dom = 11)\n  GROUP BY i_brand_id, i_class_id, i_category_id\n  HAVING sum(ss_quantity * ss_list_price) > (SELECT average_sales\n  FROM avg_sales)) last_year\nWHERE this_year.i_brand_id = last_year.i_brand_id\n  AND this_year.i_class_id = last_year.i_class_id\n  AND this_year.i_category_id = last_year.i_category_id\nORDER BY this_year.channel, this_year.i_brand_id, this_year.i_class_id, this_year.i_category_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q15.sql",
    "content": "SELECT\n  ca_zip,\n  sum(cs_sales_price)\nFROM catalog_sales, customer, customer_address, date_dim\nWHERE cs_bill_customer_sk = c_customer_sk\n  AND c_current_addr_sk = ca_address_sk\n  AND (substr(ca_zip, 1, 5) IN ('85669', '86197', '88274', '83405', '86475',\n                                '85392', '85460', '80348', '81792')\n  OR ca_state IN ('CA', 'WA', 'GA')\n  OR cs_sales_price > 500)\n  AND cs_sold_date_sk = d_date_sk\n  AND d_qoy = 2 AND d_year = 2001\nGROUP BY ca_zip\nORDER BY ca_zip\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q16.sql",
    "content": "SELECT\n  count(DISTINCT cs_order_number) AS `order count `,\n  sum(cs_ext_ship_cost) AS `total shipping cost `,\n  sum(cs_net_profit) AS `total net profit `\nFROM\n  catalog_sales cs1, date_dim, customer_address, call_center\nWHERE\n  d_date BETWEEN '2002-02-01' AND (CAST('2002-02-01' AS DATE) + INTERVAL 60 days)\n    AND cs1.cs_ship_date_sk = d_date_sk\n    AND cs1.cs_ship_addr_sk = ca_address_sk\n    AND ca_state = 'GA'\n    AND cs1.cs_call_center_sk = cc_call_center_sk\n    AND cc_county IN\n    ('Williamson County', 'Williamson County', 'Williamson County', 'Williamson County', 'Williamson County')\n    AND EXISTS(SELECT *\n               FROM catalog_sales cs2\n               WHERE cs1.cs_order_number = cs2.cs_order_number\n                 AND cs1.cs_warehouse_sk <> cs2.cs_warehouse_sk)\n    AND NOT EXISTS(SELECT *\n                   FROM catalog_returns cr1\n                   WHERE cs1.cs_order_number = cr1.cr_order_number)\nORDER BY count(DISTINCT cs_order_number)\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q17.sql",
    "content": "SELECT\n  i_item_id,\n  i_item_desc,\n  s_state,\n  count(ss_quantity) AS store_sales_quantitycount,\n  avg(ss_quantity) AS store_sales_quantityave,\n  stddev_samp(ss_quantity) AS store_sales_quantitystdev,\n  stddev_samp(ss_quantity) / avg(ss_quantity) AS store_sales_quantitycov,\n  count(sr_return_quantity) AS store_returns_quantitycount,\n  avg(sr_return_quantity) AS store_returns_quantityave,\n  stddev_samp(sr_return_quantity) AS store_returns_quantitystdev,\n  stddev_samp(sr_return_quantity) / avg(sr_return_quantity) AS store_returns_quantitycov,\n  count(cs_quantity) AS catalog_sales_quantitycount,\n  avg(cs_quantity) AS catalog_sales_quantityave,\n  stddev_samp(cs_quantity) / avg(cs_quantity) AS catalog_sales_quantitystdev,\n  stddev_samp(cs_quantity) / avg(cs_quantity) AS catalog_sales_quantitycov\nFROM store_sales, store_returns, catalog_sales, date_dim d1, date_dim d2, date_dim d3, store, item\nWHERE d1.d_quarter_name = '2001Q1'\n  AND d1.d_date_sk = ss_sold_date_sk\n  AND i_item_sk = ss_item_sk\n  AND s_store_sk = ss_store_sk\n  AND ss_customer_sk = sr_customer_sk\n  AND ss_item_sk = sr_item_sk\n  AND ss_ticket_number = sr_ticket_number\n  AND sr_returned_date_sk = d2.d_date_sk\n  AND d2.d_quarter_name IN ('2001Q1', '2001Q2', '2001Q3')\n  AND sr_customer_sk = cs_bill_customer_sk\n  AND sr_item_sk = cs_item_sk\n  AND cs_sold_date_sk = d3.d_date_sk\n  AND d3.d_quarter_name IN ('2001Q1', '2001Q2', '2001Q3')\nGROUP BY i_item_id, i_item_desc, s_state\nORDER BY i_item_id, i_item_desc, s_state\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q18.sql",
    "content": "SELECT\n  i_item_id,\n  ca_country,\n  ca_state,\n  ca_county,\n  avg(cast(cs_quantity AS DECIMAL(12, 2))) agg1,\n  avg(cast(cs_list_price AS DECIMAL(12, 2))) agg2,\n  avg(cast(cs_coupon_amt AS DECIMAL(12, 2))) agg3,\n  avg(cast(cs_sales_price AS DECIMAL(12, 2))) agg4,\n  avg(cast(cs_net_profit AS DECIMAL(12, 2))) agg5,\n  avg(cast(c_birth_year AS DECIMAL(12, 2))) agg6,\n  avg(cast(cd1.cd_dep_count AS DECIMAL(12, 2))) agg7\nFROM catalog_sales, customer_demographics cd1,\n  customer_demographics cd2, customer, customer_address, date_dim, item\nWHERE cs_sold_date_sk = d_date_sk AND\n  cs_item_sk = i_item_sk AND\n  cs_bill_cdemo_sk = cd1.cd_demo_sk AND\n  cs_bill_customer_sk = c_customer_sk AND\n  cd1.cd_gender = 'F' AND\n  cd1.cd_education_status = 'Unknown' AND\n  c_current_cdemo_sk = cd2.cd_demo_sk AND\n  c_current_addr_sk = ca_address_sk AND\n  c_birth_month IN (1, 6, 8, 9, 12, 2) AND\n  d_year = 1998 AND\n  ca_state IN ('MS', 'IN', 'ND', 'OK', 'NM', 'VA', 'MS')\nGROUP BY ROLLUP (i_item_id, ca_country, ca_state, ca_county)\nORDER BY ca_country, ca_state, ca_county, i_item_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q19.sql",
    "content": "SELECT\n  i_brand_id brand_id,\n  i_brand brand,\n  i_manufact_id,\n  i_manufact,\n  sum(ss_ext_sales_price) ext_price\nFROM date_dim, store_sales, item, customer, customer_address, store\nWHERE d_date_sk = ss_sold_date_sk\n  AND ss_item_sk = i_item_sk\n  AND i_manager_id = 8\n  AND d_moy = 11\n  AND d_year = 1998\n  AND ss_customer_sk = c_customer_sk\n  AND c_current_addr_sk = ca_address_sk\n  AND substr(ca_zip, 1, 5) <> substr(s_zip, 1, 5)\n  AND ss_store_sk = s_store_sk\nGROUP BY i_brand, i_brand_id, i_manufact_id, i_manufact\nORDER BY ext_price DESC, brand, brand_id, i_manufact_id, i_manufact\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q2.sql",
    "content": "WITH wscs AS\n( SELECT\n    sold_date_sk,\n    sales_price\n  FROM (SELECT\n    ws_sold_date_sk sold_date_sk,\n    ws_ext_sales_price sales_price\n  FROM web_sales) x\n  UNION ALL\n  (SELECT\n    cs_sold_date_sk sold_date_sk,\n    cs_ext_sales_price sales_price\n  FROM catalog_sales)),\n    wswscs AS\n  ( SELECT\n    d_week_seq,\n    sum(CASE WHEN (d_day_name = 'Sunday')\n      THEN sales_price\n        ELSE NULL END)\n    sun_sales,\n    sum(CASE WHEN (d_day_name = 'Monday')\n      THEN sales_price\n        ELSE NULL END)\n    mon_sales,\n    sum(CASE WHEN (d_day_name = 'Tuesday')\n      THEN sales_price\n        ELSE NULL END)\n    tue_sales,\n    sum(CASE WHEN (d_day_name = 'Wednesday')\n      THEN sales_price\n        ELSE NULL END)\n    wed_sales,\n    sum(CASE WHEN (d_day_name = 'Thursday')\n      THEN sales_price\n        ELSE NULL END)\n    thu_sales,\n    sum(CASE WHEN (d_day_name = 'Friday')\n      THEN sales_price\n        ELSE NULL END)\n    fri_sales,\n    sum(CASE WHEN (d_day_name = 'Saturday')\n      THEN sales_price\n        ELSE NULL END)\n    sat_sales\n  FROM wscs, date_dim\n  WHERE d_date_sk = sold_date_sk\n  GROUP BY d_week_seq)\nSELECT\n  d_week_seq1,\n  round(sun_sales1 / sun_sales2, 2),\n  round(mon_sales1 / mon_sales2, 2),\n  round(tue_sales1 / tue_sales2, 2),\n  round(wed_sales1 / wed_sales2, 2),\n  round(thu_sales1 / thu_sales2, 2),\n  round(fri_sales1 / fri_sales2, 2),\n  round(sat_sales1 / sat_sales2, 2)\nFROM\n  (SELECT\n    wswscs.d_week_seq d_week_seq1,\n    sun_sales sun_sales1,\n    mon_sales mon_sales1,\n    tue_sales tue_sales1,\n    wed_sales wed_sales1,\n    thu_sales thu_sales1,\n    fri_sales fri_sales1,\n    sat_sales sat_sales1\n  FROM wswscs, date_dim\n  WHERE date_dim.d_week_seq = wswscs.d_week_seq AND d_year = 2001) y,\n  (SELECT\n    wswscs.d_week_seq d_week_seq2,\n    sun_sales sun_sales2,\n    mon_sales mon_sales2,\n    tue_sales tue_sales2,\n    wed_sales wed_sales2,\n    thu_sales thu_sales2,\n    
fri_sales fri_sales2,\n    sat_sales sat_sales2\n  FROM wswscs, date_dim\n  WHERE date_dim.d_week_seq = wswscs.d_week_seq AND d_year = 2001 + 1) z\nWHERE d_week_seq1 = d_week_seq2 - 53\nORDER BY d_week_seq1\n"
  },
  {
    "path": "spark-queries-tpcds/q20.sql",
    "content": "SELECT\n  i_item_desc,\n  i_category,\n  i_class,\n  i_current_price,\n  sum(cs_ext_sales_price) AS itemrevenue,\n  sum(cs_ext_sales_price) * 100 / sum(sum(cs_ext_sales_price))\n  OVER\n  (PARTITION BY i_class) AS revenueratio\nFROM catalog_sales, item, date_dim\nWHERE cs_item_sk = i_item_sk\n  AND i_category IN ('Sports', 'Books', 'Home')\n  AND cs_sold_date_sk = d_date_sk\n  AND d_date BETWEEN cast('1999-02-22' AS DATE)\nAND (cast('1999-02-22' AS DATE) + INTERVAL 30 days)\nGROUP BY i_item_id, i_item_desc, i_category, i_class, i_current_price\nORDER BY i_category, i_class, i_item_id, i_item_desc, revenueratio\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q21.sql",
    "content": "SELECT *\nFROM (\n       SELECT\n         w_warehouse_name,\n         i_item_id,\n         sum(CASE WHEN (cast(d_date AS DATE) < cast('2000-03-11' AS DATE))\n           THEN inv_quantity_on_hand\n             ELSE 0 END) AS inv_before,\n         sum(CASE WHEN (cast(d_date AS DATE) >= cast('2000-03-11' AS DATE))\n           THEN inv_quantity_on_hand\n             ELSE 0 END) AS inv_after\n       FROM inventory, warehouse, item, date_dim\n       WHERE i_current_price BETWEEN 0.99 AND 1.49\n         AND i_item_sk = inv_item_sk\n         AND inv_warehouse_sk = w_warehouse_sk\n         AND inv_date_sk = d_date_sk\n         AND d_date BETWEEN (cast('2000-03-11' AS DATE) - INTERVAL 30 days)\n       AND (cast('2000-03-11' AS DATE) + INTERVAL 30 days)\n       GROUP BY w_warehouse_name, i_item_id) x\nWHERE (CASE WHEN inv_before > 0\n  THEN inv_after / inv_before\n       ELSE NULL\n       END) BETWEEN 2.0 / 3.0 AND 3.0 / 2.0\nORDER BY w_warehouse_name, i_item_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q22.sql",
    "content": "SELECT\n  i_product_name,\n  i_brand,\n  i_class,\n  i_category,\n  avg(inv_quantity_on_hand) qoh\nFROM inventory, date_dim, item, warehouse\nWHERE inv_date_sk = d_date_sk\n  AND inv_item_sk = i_item_sk\n  AND inv_warehouse_sk = w_warehouse_sk\n  AND d_month_seq BETWEEN 1200 AND 1200 + 11\nGROUP BY ROLLUP (i_product_name, i_brand, i_class, i_category)\nORDER BY qoh, i_product_name, i_brand, i_class, i_category\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q23a.sql",
    "content": "WITH frequent_ss_items AS\n(SELECT\n    substr(i_item_desc, 1, 30) itemdesc,\n    i_item_sk item_sk,\n    d_date solddate,\n    count(*) cnt\n  FROM store_sales, date_dim, item\n  WHERE ss_sold_date_sk = d_date_sk\n    AND ss_item_sk = i_item_sk\n    AND d_year IN (2000, 2000 + 1, 2000 + 2, 2000 + 3)\n  GROUP BY substr(i_item_desc, 1, 30), i_item_sk, d_date\n  HAVING count(*) > 4),\n    max_store_sales AS\n  (SELECT max(csales) tpcds_cmax\n  FROM (SELECT\n    c_customer_sk,\n    sum(ss_quantity * ss_sales_price) csales\n  FROM store_sales, customer, date_dim\n  WHERE ss_customer_sk = c_customer_sk\n    AND ss_sold_date_sk = d_date_sk\n    AND d_year IN (2000, 2000 + 1, 2000 + 2, 2000 + 3)\n  GROUP BY c_customer_sk) x),\n    best_ss_customer AS\n  (SELECT\n    c_customer_sk,\n    sum(ss_quantity * ss_sales_price) ssales\n  FROM store_sales, customer\n  WHERE ss_customer_sk = c_customer_sk\n  GROUP BY c_customer_sk\n  HAVING sum(ss_quantity * ss_sales_price) > (50 / 100.0) *\n    (SELECT *\n    FROM max_store_sales))\nSELECT sum(sales)\nFROM ((SELECT cs_quantity * cs_list_price sales\nFROM catalog_sales, date_dim\nWHERE d_year = 2000\n  AND d_moy = 2\n  AND cs_sold_date_sk = d_date_sk\n  AND cs_item_sk IN (SELECT item_sk\nFROM frequent_ss_items)\n  AND cs_bill_customer_sk IN (SELECT c_customer_sk\nFROM best_ss_customer))\n      UNION ALL\n      (SELECT ws_quantity * ws_list_price sales\n      FROM web_sales, date_dim\n      WHERE d_year = 2000\n        AND d_moy = 2\n        AND ws_sold_date_sk = d_date_sk\n        AND ws_item_sk IN (SELECT item_sk\n      FROM frequent_ss_items)\n        AND ws_bill_customer_sk IN (SELECT c_customer_sk\n      FROM best_ss_customer))) y\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q23b.sql",
    "content": "WITH frequent_ss_items AS\n(SELECT\n    substr(i_item_desc, 1, 30) itemdesc,\n    i_item_sk item_sk,\n    d_date solddate,\n    count(*) cnt\n  FROM store_sales, date_dim, item\n  WHERE ss_sold_date_sk = d_date_sk\n    AND ss_item_sk = i_item_sk\n    AND d_year IN (2000, 2000 + 1, 2000 + 2, 2000 + 3)\n  GROUP BY substr(i_item_desc, 1, 30), i_item_sk, d_date\n  HAVING count(*) > 4),\n    max_store_sales AS\n  (SELECT max(csales) tpcds_cmax\n  FROM (SELECT\n    c_customer_sk,\n    sum(ss_quantity * ss_sales_price) csales\n  FROM store_sales, customer, date_dim\n  WHERE ss_customer_sk = c_customer_sk\n    AND ss_sold_date_sk = d_date_sk\n    AND d_year IN (2000, 2000 + 1, 2000 + 2, 2000 + 3)\n  GROUP BY c_customer_sk) x),\n    best_ss_customer AS\n  (SELECT\n    c_customer_sk,\n    sum(ss_quantity * ss_sales_price) ssales\n  FROM store_sales\n    , customer\n  WHERE ss_customer_sk = c_customer_sk\n  GROUP BY c_customer_sk\n  HAVING sum(ss_quantity * ss_sales_price) > (50 / 100.0) *\n    (SELECT *\n    FROM max_store_sales))\nSELECT\n  c_last_name,\n  c_first_name,\n  sales\nFROM ((SELECT\n  c_last_name,\n  c_first_name,\n  sum(cs_quantity * cs_list_price) sales\nFROM catalog_sales, customer, date_dim\nWHERE d_year = 2000\n  AND d_moy = 2\n  AND cs_sold_date_sk = d_date_sk\n  AND cs_item_sk IN (SELECT item_sk\nFROM frequent_ss_items)\n  AND cs_bill_customer_sk IN (SELECT c_customer_sk\nFROM best_ss_customer)\n  AND cs_bill_customer_sk = c_customer_sk\nGROUP BY c_last_name, c_first_name)\n      UNION ALL\n      (SELECT\n        c_last_name,\n        c_first_name,\n        sum(ws_quantity * ws_list_price) sales\n      FROM web_sales, customer, date_dim\n      WHERE d_year = 2000\n        AND d_moy = 2\n        AND ws_sold_date_sk = d_date_sk\n        AND ws_item_sk IN (SELECT item_sk\n      FROM frequent_ss_items)\n        AND ws_bill_customer_sk IN (SELECT c_customer_sk\n      FROM best_ss_customer)\n        AND ws_bill_customer_sk = c_customer_sk\n      
GROUP BY c_last_name, c_first_name)) y\nORDER BY c_last_name, c_first_name, sales\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q24a.sql",
    "content": "WITH ssales AS\n(SELECT\n    c_last_name,\n    c_first_name,\n    s_store_name,\n    ca_state,\n    s_state,\n    i_color,\n    i_current_price,\n    i_manager_id,\n    i_units,\n    i_size,\n    sum(ss_net_paid) netpaid\n  FROM store_sales, store_returns, store, item, customer, customer_address\n  WHERE ss_ticket_number = sr_ticket_number\n    AND ss_item_sk = sr_item_sk\n    AND ss_customer_sk = c_customer_sk\n    AND ss_item_sk = i_item_sk\n    AND ss_store_sk = s_store_sk\n    AND c_birth_country = upper(ca_country)\n    AND s_zip = ca_zip\n    AND s_market_id = 8\n  GROUP BY c_last_name, c_first_name, s_store_name, ca_state, s_state, i_color,\n    i_current_price, i_manager_id, i_units, i_size)\nSELECT\n  c_last_name,\n  c_first_name,\n  s_store_name,\n  sum(netpaid) paid\nFROM ssales\nWHERE i_color = 'pale'\nGROUP BY c_last_name, c_first_name, s_store_name\nHAVING sum(netpaid) > (SELECT 0.05 * avg(netpaid)\nFROM ssales)\n"
  },
  {
    "path": "spark-queries-tpcds/q24b.sql",
    "content": "WITH ssales AS\n(SELECT\n    c_last_name,\n    c_first_name,\n    s_store_name,\n    ca_state,\n    s_state,\n    i_color,\n    i_current_price,\n    i_manager_id,\n    i_units,\n    i_size,\n    sum(ss_net_paid) netpaid\n  FROM store_sales, store_returns, store, item, customer, customer_address\n  WHERE ss_ticket_number = sr_ticket_number\n    AND ss_item_sk = sr_item_sk\n    AND ss_customer_sk = c_customer_sk\n    AND ss_item_sk = i_item_sk\n    AND ss_store_sk = s_store_sk\n    AND c_birth_country = upper(ca_country)\n    AND s_zip = ca_zip\n    AND s_market_id = 8\n  GROUP BY c_last_name, c_first_name, s_store_name, ca_state, s_state,\n    i_color, i_current_price, i_manager_id, i_units, i_size)\nSELECT\n  c_last_name,\n  c_first_name,\n  s_store_name,\n  sum(netpaid) paid\nFROM ssales\nWHERE i_color = 'chiffon'\nGROUP BY c_last_name, c_first_name, s_store_name\nHAVING sum(netpaid) > (SELECT 0.05 * avg(netpaid)\nFROM ssales)\n"
  },
  {
    "path": "spark-queries-tpcds/q25.sql",
    "content": "SELECT\n  i_item_id,\n  i_item_desc,\n  s_store_id,\n  s_store_name,\n  sum(ss_net_profit) AS store_sales_profit,\n  sum(sr_net_loss) AS store_returns_loss,\n  sum(cs_net_profit) AS catalog_sales_profit\nFROM\n  store_sales, store_returns, catalog_sales, date_dim d1, date_dim d2, date_dim d3,\n  store, item\nWHERE\n  d1.d_moy = 4\n    AND d1.d_year = 2001\n    AND d1.d_date_sk = ss_sold_date_sk\n    AND i_item_sk = ss_item_sk\n    AND s_store_sk = ss_store_sk\n    AND ss_customer_sk = sr_customer_sk\n    AND ss_item_sk = sr_item_sk\n    AND ss_ticket_number = sr_ticket_number\n    AND sr_returned_date_sk = d2.d_date_sk\n    AND d2.d_moy BETWEEN 4 AND 10\n    AND d2.d_year = 2001\n    AND sr_customer_sk = cs_bill_customer_sk\n    AND sr_item_sk = cs_item_sk\n    AND cs_sold_date_sk = d3.d_date_sk\n    AND d3.d_moy BETWEEN 4 AND 10\n    AND d3.d_year = 2001\nGROUP BY\n  i_item_id, i_item_desc, s_store_id, s_store_name\nORDER BY\n  i_item_id, i_item_desc, s_store_id, s_store_name\nLIMIT 100"
  },
  {
    "path": "spark-queries-tpcds/q26.sql",
    "content": "SELECT\n  i_item_id,\n  avg(cs_quantity) agg1,\n  avg(cs_list_price) agg2,\n  avg(cs_coupon_amt) agg3,\n  avg(cs_sales_price) agg4\nFROM catalog_sales, customer_demographics, date_dim, item, promotion\nWHERE cs_sold_date_sk = d_date_sk AND\n  cs_item_sk = i_item_sk AND\n  cs_bill_cdemo_sk = cd_demo_sk AND\n  cs_promo_sk = p_promo_sk AND\n  cd_gender = 'M' AND\n  cd_marital_status = 'S' AND\n  cd_education_status = 'College' AND\n  (p_channel_email = 'N' OR p_channel_event = 'N') AND\n  d_year = 2000\nGROUP BY i_item_id\nORDER BY i_item_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q27.sql",
    "content": "SELECT\n  i_item_id,\n  s_state,\n  grouping(s_state) g_state,\n  avg(ss_quantity) agg1,\n  avg(ss_list_price) agg2,\n  avg(ss_coupon_amt) agg3,\n  avg(ss_sales_price) agg4\nFROM store_sales, customer_demographics, date_dim, store, item\nWHERE ss_sold_date_sk = d_date_sk AND\n  ss_item_sk = i_item_sk AND\n  ss_store_sk = s_store_sk AND\n  ss_cdemo_sk = cd_demo_sk AND\n  cd_gender = 'M' AND\n  cd_marital_status = 'S' AND\n  cd_education_status = 'College' AND\n  d_year = 2002 AND\n  s_state IN ('TN', 'TN', 'TN', 'TN', 'TN', 'TN')\nGROUP BY ROLLUP (i_item_id, s_state)\nORDER BY i_item_id, s_state\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q28.sql",
    "content": "SELECT *\nFROM (SELECT\n  avg(ss_list_price) B1_LP,\n  count(ss_list_price) B1_CNT,\n  count(DISTINCT ss_list_price) B1_CNTD\nFROM store_sales\nWHERE ss_quantity BETWEEN 0 AND 5\n  AND (ss_list_price BETWEEN 8 AND 8 + 10\n  OR ss_coupon_amt BETWEEN 459 AND 459 + 1000\n  OR ss_wholesale_cost BETWEEN 57 AND 57 + 20)) B1,\n  (SELECT\n    avg(ss_list_price) B2_LP,\n    count(ss_list_price) B2_CNT,\n    count(DISTINCT ss_list_price) B2_CNTD\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 6 AND 10\n    AND (ss_list_price BETWEEN 90 AND 90 + 10\n    OR ss_coupon_amt BETWEEN 2323 AND 2323 + 1000\n    OR ss_wholesale_cost BETWEEN 31 AND 31 + 20)) B2,\n  (SELECT\n    avg(ss_list_price) B3_LP,\n    count(ss_list_price) B3_CNT,\n    count(DISTINCT ss_list_price) B3_CNTD\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 11 AND 15\n    AND (ss_list_price BETWEEN 142 AND 142 + 10\n    OR ss_coupon_amt BETWEEN 12214 AND 12214 + 1000\n    OR ss_wholesale_cost BETWEEN 79 AND 79 + 20)) B3,\n  (SELECT\n    avg(ss_list_price) B4_LP,\n    count(ss_list_price) B4_CNT,\n    count(DISTINCT ss_list_price) B4_CNTD\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 16 AND 20\n    AND (ss_list_price BETWEEN 135 AND 135 + 10\n    OR ss_coupon_amt BETWEEN 6071 AND 6071 + 1000\n    OR ss_wholesale_cost BETWEEN 38 AND 38 + 20)) B4,\n  (SELECT\n    avg(ss_list_price) B5_LP,\n    count(ss_list_price) B5_CNT,\n    count(DISTINCT ss_list_price) B5_CNTD\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 21 AND 25\n    AND (ss_list_price BETWEEN 122 AND 122 + 10\n    OR ss_coupon_amt BETWEEN 836 AND 836 + 1000\n    OR ss_wholesale_cost BETWEEN 17 AND 17 + 20)) B5,\n  (SELECT\n    avg(ss_list_price) B6_LP,\n    count(ss_list_price) B6_CNT,\n    count(DISTINCT ss_list_price) B6_CNTD\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 26 AND 30\n    AND (ss_list_price BETWEEN 154 AND 154 + 10\n    OR ss_coupon_amt BETWEEN 7326 AND 7326 + 1000\n    OR ss_wholesale_cost BETWEEN 7 AND 7 + 20)) 
B6\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q29.sql",
    "content": "SELECT\n  i_item_id,\n  i_item_desc,\n  s_store_id,\n  s_store_name,\n  sum(ss_quantity) AS store_sales_quantity,\n  sum(sr_return_quantity) AS store_returns_quantity,\n  sum(cs_quantity) AS catalog_sales_quantity\nFROM\n  store_sales, store_returns, catalog_sales, date_dim d1, date_dim d2,\n  date_dim d3, store, item\nWHERE\n  d1.d_moy = 9\n    AND d1.d_year = 1999\n    AND d1.d_date_sk = ss_sold_date_sk\n    AND i_item_sk = ss_item_sk\n    AND s_store_sk = ss_store_sk\n    AND ss_customer_sk = sr_customer_sk\n    AND ss_item_sk = sr_item_sk\n    AND ss_ticket_number = sr_ticket_number\n    AND sr_returned_date_sk = d2.d_date_sk\n    AND d2.d_moy BETWEEN 9 AND 9 + 3\n    AND d2.d_year = 1999\n    AND sr_customer_sk = cs_bill_customer_sk\n    AND sr_item_sk = cs_item_sk\n    AND cs_sold_date_sk = d3.d_date_sk\n    AND d3.d_year IN (1999, 1999 + 1, 1999 + 2)\nGROUP BY\n  i_item_id, i_item_desc, s_store_id, s_store_name\nORDER BY\n  i_item_id, i_item_desc, s_store_id, s_store_name\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q3.sql",
    "content": "SELECT\n  dt.d_year,\n  item.i_brand_id brand_id,\n  item.i_brand brand,\n  SUM(ss_ext_sales_price) sum_agg\nFROM date_dim dt, store_sales, item\nWHERE dt.d_date_sk = store_sales.ss_sold_date_sk\n  AND store_sales.ss_item_sk = item.i_item_sk\n  AND item.i_manufact_id = 128\n  AND dt.d_moy = 11\nGROUP BY dt.d_year, item.i_brand, item.i_brand_id\nORDER BY dt.d_year, sum_agg DESC, brand_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q30.sql",
    "content": "WITH customer_total_return AS\n(SELECT\n    wr_returning_customer_sk AS ctr_customer_sk,\n    ca_state AS ctr_state,\n    sum(wr_return_amt) AS ctr_total_return\n  FROM web_returns, date_dim, customer_address\n  WHERE wr_returned_date_sk = d_date_sk\n    AND d_year = 2002\n    AND wr_returning_addr_sk = ca_address_sk\n  GROUP BY wr_returning_customer_sk, ca_state)\nSELECT\n  c_customer_id,\n  c_salutation,\n  c_first_name,\n  c_last_name,\n  c_preferred_cust_flag,\n  c_birth_day,\n  c_birth_month,\n  c_birth_year,\n  c_birth_country,\n  c_login,\n  c_email_address,\n  c_last_review_date,\n  ctr_total_return\nFROM customer_total_return ctr1, customer_address, customer\nWHERE ctr1.ctr_total_return > (SELECT avg(ctr_total_return) * 1.2\nFROM customer_total_return ctr2\nWHERE ctr1.ctr_state = ctr2.ctr_state)\n  AND ca_address_sk = c_current_addr_sk\n  AND ca_state = 'GA'\n  AND ctr1.ctr_customer_sk = c_customer_sk\nORDER BY c_customer_id, c_salutation, c_first_name, c_last_name, c_preferred_cust_flag\n  , c_birth_day, c_birth_month, c_birth_year, c_birth_country, c_login, c_email_address\n  , c_last_review_date, ctr_total_return\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q31.sql",
    "content": "WITH ss AS\n(SELECT\n    ca_county,\n    d_qoy,\n    d_year,\n    sum(ss_ext_sales_price) AS store_sales\n  FROM store_sales, date_dim, customer_address\n  WHERE ss_sold_date_sk = d_date_sk\n    AND ss_addr_sk = ca_address_sk\n  GROUP BY ca_county, d_qoy, d_year),\n    ws AS\n  (SELECT\n    ca_county,\n    d_qoy,\n    d_year,\n    sum(ws_ext_sales_price) AS web_sales\n  FROM web_sales, date_dim, customer_address\n  WHERE ws_sold_date_sk = d_date_sk\n    AND ws_bill_addr_sk = ca_address_sk\n  GROUP BY ca_county, d_qoy, d_year)\nSELECT\n  ss1.ca_county,\n  ss1.d_year,\n  ws2.web_sales / ws1.web_sales web_q1_q2_increase,\n  ss2.store_sales / ss1.store_sales store_q1_q2_increase,\n  ws3.web_sales / ws2.web_sales web_q2_q3_increase,\n  ss3.store_sales / ss2.store_sales store_q2_q3_increase\nFROM\n  ss ss1, ss ss2, ss ss3, ws ws1, ws ws2, ws ws3\nWHERE\n  ss1.d_qoy = 1\n    AND ss1.d_year = 2000\n    AND ss1.ca_county = ss2.ca_county\n    AND ss2.d_qoy = 2\n    AND ss2.d_year = 2000\n    AND ss2.ca_county = ss3.ca_county\n    AND ss3.d_qoy = 3\n    AND ss3.d_year = 2000\n    AND ss1.ca_county = ws1.ca_county\n    AND ws1.d_qoy = 1\n    AND ws1.d_year = 2000\n    AND ws1.ca_county = ws2.ca_county\n    AND ws2.d_qoy = 2\n    AND ws2.d_year = 2000\n    AND ws1.ca_county = ws3.ca_county\n    AND ws3.d_qoy = 3\n    AND ws3.d_year = 2000\n    AND CASE WHEN ws1.web_sales > 0\n    THEN ws2.web_sales / ws1.web_sales\n        ELSE NULL END\n    > CASE WHEN ss1.store_sales > 0\n    THEN ss2.store_sales / ss1.store_sales\n      ELSE NULL END\n    AND CASE WHEN ws2.web_sales > 0\n    THEN ws3.web_sales / ws2.web_sales\n        ELSE NULL END\n    > CASE WHEN ss2.store_sales > 0\n    THEN ss3.store_sales / ss2.store_sales\n      ELSE NULL END\nORDER BY ss1.ca_county\n"
  },
  {
    "path": "spark-queries-tpcds/q32.sql",
    "content": "SELECT 1 AS `excess discount amount `\nFROM\n  catalog_sales, item, date_dim\nWHERE\n  i_manufact_id = 977\n    AND i_item_sk = cs_item_sk\n    AND d_date BETWEEN '2000-01-27' AND (cast('2000-01-27' AS DATE) + interval 90 days)\n    AND d_date_sk = cs_sold_date_sk\n    AND cs_ext_discount_amt > (\n    SELECT 1.3 * avg(cs_ext_discount_amt)\n    FROM catalog_sales, date_dim\n    WHERE cs_item_sk = i_item_sk\n      AND d_date BETWEEN '2000-01-27]' AND (cast('2000-01-27' AS DATE) + interval 90 days)\n      AND d_date_sk = cs_sold_date_sk)\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q33.sql",
    "content": "WITH ss AS (\n  SELECT\n    i_manufact_id,\n    sum(ss_ext_sales_price) total_sales\n  FROM\n    store_sales, date_dim, customer_address, item\n  WHERE\n    i_manufact_id IN (SELECT i_manufact_id\n    FROM item\n    WHERE i_category IN ('Electronics'))\n      AND ss_item_sk = i_item_sk\n      AND ss_sold_date_sk = d_date_sk\n      AND d_year = 1998\n      AND d_moy = 5\n      AND ss_addr_sk = ca_address_sk\n      AND ca_gmt_offset = -5\n  GROUP BY i_manufact_id), cs AS\n(SELECT\n    i_manufact_id,\n    sum(cs_ext_sales_price) total_sales\n  FROM catalog_sales, date_dim, customer_address, item\n  WHERE\n    i_manufact_id IN (\n      SELECT i_manufact_id\n      FROM item\n      WHERE\n        i_category IN ('Electronics'))\n      AND cs_item_sk = i_item_sk\n      AND cs_sold_date_sk = d_date_sk\n      AND d_year = 1998\n      AND d_moy = 5\n      AND cs_bill_addr_sk = ca_address_sk\n      AND ca_gmt_offset = -5\n  GROUP BY i_manufact_id),\n    ws AS (\n    SELECT\n      i_manufact_id,\n      sum(ws_ext_sales_price) total_sales\n    FROM\n      web_sales, date_dim, customer_address, item\n    WHERE\n      i_manufact_id IN (SELECT i_manufact_id\n      FROM item\n      WHERE i_category IN ('Electronics'))\n        AND ws_item_sk = i_item_sk\n        AND ws_sold_date_sk = d_date_sk\n        AND d_year = 1998\n        AND d_moy = 5\n        AND ws_bill_addr_sk = ca_address_sk\n        AND ca_gmt_offset = -5\n    GROUP BY i_manufact_id)\nSELECT\n  i_manufact_id,\n  sum(total_sales) total_sales\nFROM (SELECT *\n      FROM ss\n      UNION ALL\n      SELECT *\n      FROM cs\n      UNION ALL\n      SELECT *\n      FROM ws) tmp1\nGROUP BY i_manufact_id\nORDER BY total_sales\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q34.sql",
    "content": "SELECT\n  c_last_name,\n  c_first_name,\n  c_salutation,\n  c_preferred_cust_flag,\n  ss_ticket_number,\n  cnt\nFROM\n  (SELECT\n    ss_ticket_number,\n    ss_customer_sk,\n    count(*) cnt\n  FROM store_sales, date_dim, store, household_demographics\n  WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk\n    AND store_sales.ss_store_sk = store.s_store_sk\n    AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND (date_dim.d_dom BETWEEN 1 AND 3 OR date_dim.d_dom BETWEEN 25 AND 28)\n    AND (household_demographics.hd_buy_potential = '>10000' OR\n    household_demographics.hd_buy_potential = 'unknown')\n    AND household_demographics.hd_vehicle_count > 0\n    AND (CASE WHEN household_demographics.hd_vehicle_count > 0\n    THEN household_demographics.hd_dep_count / household_demographics.hd_vehicle_count\n         ELSE NULL\n         END) > 1.2\n    AND date_dim.d_year IN (1999, 1999 + 1, 1999 + 2)\n    AND store.s_county IN\n    ('Williamson County', 'Williamson County', 'Williamson County', 'Williamson County',\n     'Williamson County', 'Williamson County', 'Williamson County', 'Williamson County')\n  GROUP BY ss_ticket_number, ss_customer_sk) dn, customer\nWHERE ss_customer_sk = c_customer_sk\n  AND cnt BETWEEN 15 AND 20\nORDER BY c_last_name, c_first_name, c_salutation, c_preferred_cust_flag DESC\n"
  },
  {
    "path": "spark-queries-tpcds/q35.sql",
    "content": "SELECT\n  ca_state,\n  cd_gender,\n  cd_marital_status,\n  count(*) cnt1,\n  min(cd_dep_count),\n  max(cd_dep_count),\n  avg(cd_dep_count),\n  cd_dep_employed_count,\n  count(*) cnt2,\n  min(cd_dep_employed_count),\n  max(cd_dep_employed_count),\n  avg(cd_dep_employed_count),\n  cd_dep_college_count,\n  count(*) cnt3,\n  min(cd_dep_college_count),\n  max(cd_dep_college_count),\n  avg(cd_dep_college_count)\nFROM\n  customer c, customer_address ca, customer_demographics\nWHERE\n  c.c_current_addr_sk = ca.ca_address_sk AND\n    cd_demo_sk = c.c_current_cdemo_sk AND\n    exists(SELECT *\n           FROM store_sales, date_dim\n           WHERE c.c_customer_sk = ss_customer_sk AND\n             ss_sold_date_sk = d_date_sk AND\n             d_year = 2002 AND\n             d_qoy < 4) AND\n    (exists(SELECT *\n            FROM web_sales, date_dim\n            WHERE c.c_customer_sk = ws_bill_customer_sk AND\n              ws_sold_date_sk = d_date_sk AND\n              d_year = 2002 AND\n              d_qoy < 4) OR\n      exists(SELECT *\n             FROM catalog_sales, date_dim\n             WHERE c.c_customer_sk = cs_ship_customer_sk AND\n               cs_sold_date_sk = d_date_sk AND\n               d_year = 2002 AND\n               d_qoy < 4))\nGROUP BY ca_state, cd_gender, cd_marital_status, cd_dep_count,\n  cd_dep_employed_count, cd_dep_college_count\nORDER BY ca_state, cd_gender, cd_marital_status, cd_dep_count,\n  cd_dep_employed_count, cd_dep_college_count\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q36.sql",
    "content": "SELECT\n  sum(ss_net_profit) / sum(ss_ext_sales_price) AS gross_margin,\n  i_category,\n  i_class,\n  grouping(i_category) + grouping(i_class) AS lochierarchy,\n  rank()\n  OVER (\n    PARTITION BY grouping(i_category) + grouping(i_class),\n      CASE WHEN grouping(i_class) = 0\n        THEN i_category END\n    ORDER BY sum(ss_net_profit) / sum(ss_ext_sales_price) ASC) AS rank_within_parent\nFROM\n  store_sales, date_dim d1, item, store\nWHERE\n  d1.d_year = 2001\n    AND d1.d_date_sk = ss_sold_date_sk\n    AND i_item_sk = ss_item_sk\n    AND s_store_sk = ss_store_sk\n    AND s_state IN ('TN', 'TN', 'TN', 'TN', 'TN', 'TN', 'TN', 'TN')\nGROUP BY ROLLUP (i_category, i_class)\nORDER BY\n  lochierarchy DESC\n  , CASE WHEN lochierarchy = 0\n  THEN i_category END\n  , rank_within_parent\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q37.sql",
    "content": "SELECT\n  i_item_id,\n  i_item_desc,\n  i_current_price\nFROM item, inventory, date_dim, catalog_sales\nWHERE i_current_price BETWEEN 68 AND 68 + 30\n  AND inv_item_sk = i_item_sk\n  AND d_date_sk = inv_date_sk\n  AND d_date BETWEEN cast('2000-02-01' AS DATE) AND (cast('2000-02-01' AS DATE) + INTERVAL 60 days)\n  AND i_manufact_id IN (677, 940, 694, 808)\n  AND inv_quantity_on_hand BETWEEN 100 AND 500\n  AND cs_item_sk = i_item_sk\nGROUP BY i_item_id, i_item_desc, i_current_price\nORDER BY i_item_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q38.sql",
    "content": "SELECT count(*)\nFROM (\n       SELECT DISTINCT\n         c_last_name,\n         c_first_name,\n         d_date\n       FROM store_sales, date_dim, customer\n       WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk\n         AND store_sales.ss_customer_sk = customer.c_customer_sk\n         AND d_month_seq BETWEEN 1200 AND 1200 + 11\n       INTERSECT\n       SELECT DISTINCT\n         c_last_name,\n         c_first_name,\n         d_date\n       FROM catalog_sales, date_dim, customer\n       WHERE catalog_sales.cs_sold_date_sk = date_dim.d_date_sk\n         AND catalog_sales.cs_bill_customer_sk = customer.c_customer_sk\n         AND d_month_seq BETWEEN 1200 AND 1200 + 11\n       INTERSECT\n       SELECT DISTINCT\n         c_last_name,\n         c_first_name,\n         d_date\n       FROM web_sales, date_dim, customer\n       WHERE web_sales.ws_sold_date_sk = date_dim.d_date_sk\n         AND web_sales.ws_bill_customer_sk = customer.c_customer_sk\n         AND d_month_seq BETWEEN 1200 AND 1200 + 11\n     ) hot_cust\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q39a.sql",
    "content": "WITH inv AS\n(SELECT\n    w_warehouse_name,\n    w_warehouse_sk,\n    i_item_sk,\n    d_moy,\n    stdev,\n    mean,\n    CASE mean\n    WHEN 0\n      THEN NULL\n    ELSE stdev / mean END cov\n  FROM (SELECT\n    w_warehouse_name,\n    w_warehouse_sk,\n    i_item_sk,\n    d_moy,\n    stddev_samp(inv_quantity_on_hand) stdev,\n    avg(inv_quantity_on_hand) mean\n  FROM inventory, item, warehouse, date_dim\n  WHERE inv_item_sk = i_item_sk\n    AND inv_warehouse_sk = w_warehouse_sk\n    AND inv_date_sk = d_date_sk\n    AND d_year = 2001\n  GROUP BY w_warehouse_name, w_warehouse_sk, i_item_sk, d_moy) foo\n  WHERE CASE mean\n        WHEN 0\n          THEN 0\n        ELSE stdev / mean END > 1)\nSELECT\n  inv1.w_warehouse_sk,\n  inv1.i_item_sk,\n  inv1.d_moy,\n  inv1.mean,\n  inv1.cov,\n  inv2.w_warehouse_sk,\n  inv2.i_item_sk,\n  inv2.d_moy,\n  inv2.mean,\n  inv2.cov\nFROM inv inv1, inv inv2\nWHERE inv1.i_item_sk = inv2.i_item_sk\n  AND inv1.w_warehouse_sk = inv2.w_warehouse_sk\n  AND inv1.d_moy = 1\n  AND inv2.d_moy = 1 + 1\nORDER BY inv1.w_warehouse_sk, inv1.i_item_sk, inv1.d_moy, inv1.mean, inv1.cov\n  , inv2.d_moy, inv2.mean, inv2.cov\n"
  },
  {
    "path": "spark-queries-tpcds/q39b.sql",
    "content": "WITH inv AS\n(SELECT\n    w_warehouse_name,\n    w_warehouse_sk,\n    i_item_sk,\n    d_moy,\n    stdev,\n    mean,\n    CASE mean\n    WHEN 0\n      THEN NULL\n    ELSE stdev / mean END cov\n  FROM (SELECT\n    w_warehouse_name,\n    w_warehouse_sk,\n    i_item_sk,\n    d_moy,\n    stddev_samp(inv_quantity_on_hand) stdev,\n    avg(inv_quantity_on_hand) mean\n  FROM inventory, item, warehouse, date_dim\n  WHERE inv_item_sk = i_item_sk\n    AND inv_warehouse_sk = w_warehouse_sk\n    AND inv_date_sk = d_date_sk\n    AND d_year = 2001\n  GROUP BY w_warehouse_name, w_warehouse_sk, i_item_sk, d_moy) foo\n  WHERE CASE mean\n        WHEN 0\n          THEN 0\n        ELSE stdev / mean END > 1)\nSELECT\n  inv1.w_warehouse_sk,\n  inv1.i_item_sk,\n  inv1.d_moy,\n  inv1.mean,\n  inv1.cov,\n  inv2.w_warehouse_sk,\n  inv2.i_item_sk,\n  inv2.d_moy,\n  inv2.mean,\n  inv2.cov\nFROM inv inv1, inv inv2\nWHERE inv1.i_item_sk = inv2.i_item_sk\n  AND inv1.w_warehouse_sk = inv2.w_warehouse_sk\n  AND inv1.d_moy = 1\n  AND inv2.d_moy = 1 + 1\n  AND inv1.cov > 1.5\nORDER BY inv1.w_warehouse_sk, inv1.i_item_sk, inv1.d_moy, inv1.mean, inv1.cov\n  , inv2.d_moy, inv2.mean, inv2.cov\n"
  },
  {
    "path": "spark-queries-tpcds/q4.sql",
    "content": "WITH year_total AS (\n  SELECT\n    c_customer_id customer_id,\n    c_first_name customer_first_name,\n    c_last_name customer_last_name,\n    c_preferred_cust_flag customer_preferred_cust_flag,\n    c_birth_country customer_birth_country,\n    c_login customer_login,\n    c_email_address customer_email_address,\n    d_year dyear,\n    sum(((ss_ext_list_price - ss_ext_wholesale_cost - ss_ext_discount_amt) +\n      ss_ext_sales_price) / 2) year_total,\n    's' sale_type\n  FROM customer, store_sales, date_dim\n  WHERE c_customer_sk = ss_customer_sk AND ss_sold_date_sk = d_date_sk\n  GROUP BY c_customer_id,\n    c_first_name,\n    c_last_name,\n    c_preferred_cust_flag,\n    c_birth_country,\n    c_login,\n    c_email_address,\n    d_year\n  UNION ALL\n  SELECT\n    c_customer_id customer_id,\n    c_first_name customer_first_name,\n    c_last_name customer_last_name,\n    c_preferred_cust_flag customer_preferred_cust_flag,\n    c_birth_country customer_birth_country,\n    c_login customer_login,\n    c_email_address customer_email_address,\n    d_year dyear,\n    sum((((cs_ext_list_price - cs_ext_wholesale_cost - cs_ext_discount_amt) +\n      cs_ext_sales_price) / 2)) year_total,\n    'c' sale_type\n  FROM customer, catalog_sales, date_dim\n  WHERE c_customer_sk = cs_bill_customer_sk AND cs_sold_date_sk = d_date_sk\n  GROUP BY c_customer_id,\n    c_first_name,\n    c_last_name,\n    c_preferred_cust_flag,\n    c_birth_country,\n    c_login,\n    c_email_address,\n    d_year\n  UNION ALL\n  SELECT\n    c_customer_id customer_id,\n    c_first_name customer_first_name,\n    c_last_name customer_last_name,\n    c_preferred_cust_flag customer_preferred_cust_flag,\n    c_birth_country customer_birth_country,\n    c_login customer_login,\n    c_email_address customer_email_address,\n    d_year dyear,\n    sum((((ws_ext_list_price - ws_ext_wholesale_cost - ws_ext_discount_amt) + ws_ext_sales_price) /\n      2)) year_total,\n    'w' sale_type\n  FROM 
customer, web_sales, date_dim\n  WHERE c_customer_sk = ws_bill_customer_sk AND ws_sold_date_sk = d_date_sk\n  GROUP BY c_customer_id,\n    c_first_name,\n    c_last_name,\n    c_preferred_cust_flag,\n    c_birth_country,\n    c_login,\n    c_email_address,\n    d_year)\nSELECT\n  t_s_secyear.customer_id,\n  t_s_secyear.customer_first_name,\n  t_s_secyear.customer_last_name,\n  t_s_secyear.customer_preferred_cust_flag,\n  t_s_secyear.customer_birth_country,\n  t_s_secyear.customer_login,\n  t_s_secyear.customer_email_address\nFROM year_total t_s_firstyear, year_total t_s_secyear, year_total t_c_firstyear,\n  year_total t_c_secyear, year_total t_w_firstyear, year_total t_w_secyear\nWHERE t_s_secyear.customer_id = t_s_firstyear.customer_id\n  AND t_s_firstyear.customer_id = t_c_secyear.customer_id\n  AND t_s_firstyear.customer_id = t_c_firstyear.customer_id\n  AND t_s_firstyear.customer_id = t_w_firstyear.customer_id\n  AND t_s_firstyear.customer_id = t_w_secyear.customer_id\n  AND t_s_firstyear.sale_type = 's'\n  AND t_c_firstyear.sale_type = 'c'\n  AND t_w_firstyear.sale_type = 'w'\n  AND t_s_secyear.sale_type = 's'\n  AND t_c_secyear.sale_type = 'c'\n  AND t_w_secyear.sale_type = 'w'\n  AND t_s_firstyear.dyear = 2001\n  AND t_s_secyear.dyear = 2001 + 1\n  AND t_c_firstyear.dyear = 2001\n  AND t_c_secyear.dyear = 2001 + 1\n  AND t_w_firstyear.dyear = 2001\n  AND t_w_secyear.dyear = 2001 + 1\n  AND t_s_firstyear.year_total > 0\n  AND t_c_firstyear.year_total > 0\n  AND t_w_firstyear.year_total > 0\n  AND CASE WHEN t_c_firstyear.year_total > 0\n  THEN t_c_secyear.year_total / t_c_firstyear.year_total\n      ELSE NULL END\n  > CASE WHEN t_s_firstyear.year_total > 0\n  THEN t_s_secyear.year_total / t_s_firstyear.year_total\n    ELSE NULL END\n  AND CASE WHEN t_c_firstyear.year_total > 0\n  THEN t_c_secyear.year_total / t_c_firstyear.year_total\n      ELSE NULL END\n  > CASE WHEN t_w_firstyear.year_total > 0\n  THEN t_w_secyear.year_total / t_w_firstyear.year_total\n    
ELSE NULL END\nORDER BY\n  t_s_secyear.customer_id,\n  t_s_secyear.customer_first_name,\n  t_s_secyear.customer_last_name,\n  t_s_secyear.customer_preferred_cust_flag,\n  t_s_secyear.customer_birth_country,\n  t_s_secyear.customer_login,\n  t_s_secyear.customer_email_address\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q40.sql",
    "content": "SELECT\n  w_state,\n  i_item_id,\n  sum(CASE WHEN (cast(d_date AS DATE) < cast('2000-03-11' AS DATE))\n    THEN cs_sales_price - coalesce(cr_refunded_cash, 0)\n      ELSE 0 END) AS sales_before,\n  sum(CASE WHEN (cast(d_date AS DATE) >= cast('2000-03-11' AS DATE))\n    THEN cs_sales_price - coalesce(cr_refunded_cash, 0)\n      ELSE 0 END) AS sales_after\nFROM\n  catalog_sales\n  LEFT OUTER JOIN catalog_returns ON\n                                    (cs_order_number = cr_order_number\n                                      AND cs_item_sk = cr_item_sk)\n  , warehouse, item, date_dim\nWHERE\n  i_current_price BETWEEN 0.99 AND 1.49\n    AND i_item_sk = cs_item_sk\n    AND cs_warehouse_sk = w_warehouse_sk\n    AND cs_sold_date_sk = d_date_sk\n    AND d_date BETWEEN (cast('2000-03-11' AS DATE) - INTERVAL 30 days)\n  AND (cast('2000-03-11' AS DATE) + INTERVAL 30 days)\nGROUP BY w_state, i_item_id\nORDER BY w_state, i_item_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q41.sql",
    "content": "SELECT DISTINCT (i_product_name)\nFROM item i1\nWHERE i_manufact_id BETWEEN 738 AND 738 + 40\n  AND (SELECT count(*) AS item_cnt\nFROM item\nWHERE (i_manufact = i1.i_manufact AND\n  ((i_category = 'Women' AND\n    (i_color = 'powder' OR i_color = 'khaki') AND\n    (i_units = 'Ounce' OR i_units = 'Oz') AND\n    (i_size = 'medium' OR i_size = 'extra large')\n  ) OR\n    (i_category = 'Women' AND\n      (i_color = 'brown' OR i_color = 'honeydew') AND\n      (i_units = 'Bunch' OR i_units = 'Ton') AND\n      (i_size = 'N/A' OR i_size = 'small')\n    ) OR\n    (i_category = 'Men' AND\n      (i_color = 'floral' OR i_color = 'deep') AND\n      (i_units = 'N/A' OR i_units = 'Dozen') AND\n      (i_size = 'petite' OR i_size = 'large')\n    ) OR\n    (i_category = 'Men' AND\n      (i_color = 'light' OR i_color = 'cornflower') AND\n      (i_units = 'Box' OR i_units = 'Pound') AND\n      (i_size = 'medium' OR i_size = 'extra large')\n    ))) OR\n  (i_manufact = i1.i_manufact AND\n    ((i_category = 'Women' AND\n      (i_color = 'midnight' OR i_color = 'snow') AND\n      (i_units = 'Pallet' OR i_units = 'Gross') AND\n      (i_size = 'medium' OR i_size = 'extra large')\n    ) OR\n      (i_category = 'Women' AND\n        (i_color = 'cyan' OR i_color = 'papaya') AND\n        (i_units = 'Cup' OR i_units = 'Dram') AND\n        (i_size = 'N/A' OR i_size = 'small')\n      ) OR\n      (i_category = 'Men' AND\n        (i_color = 'orange' OR i_color = 'frosted') AND\n        (i_units = 'Each' OR i_units = 'Tbl') AND\n        (i_size = 'petite' OR i_size = 'large')\n      ) OR\n      (i_category = 'Men' AND\n        (i_color = 'forest' OR i_color = 'ghost') AND\n        (i_units = 'Lb' OR i_units = 'Bundle') AND\n        (i_size = 'medium' OR i_size = 'extra large')\n      )))) > 0\nORDER BY i_product_name\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q42.sql",
    "content": "SELECT\n  dt.d_year,\n  item.i_category_id,\n  item.i_category,\n  sum(ss_ext_sales_price)\nFROM date_dim dt, store_sales, item\nWHERE dt.d_date_sk = store_sales.ss_sold_date_sk\n  AND store_sales.ss_item_sk = item.i_item_sk\n  AND item.i_manager_id = 1\n  AND dt.d_moy = 11\n  AND dt.d_year = 2000\nGROUP BY dt.d_year\n  , item.i_category_id\n  , item.i_category\nORDER BY sum(ss_ext_sales_price) DESC, dt.d_year\n  , item.i_category_id\n  , item.i_category\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q43.sql",
    "content": "SELECT\n  s_store_name,\n  s_store_id,\n  sum(CASE WHEN (d_day_name = 'Sunday')\n    THEN ss_sales_price\n      ELSE NULL END) sun_sales,\n  sum(CASE WHEN (d_day_name = 'Monday')\n    THEN ss_sales_price\n      ELSE NULL END) mon_sales,\n  sum(CASE WHEN (d_day_name = 'Tuesday')\n    THEN ss_sales_price\n      ELSE NULL END) tue_sales,\n  sum(CASE WHEN (d_day_name = 'Wednesday')\n    THEN ss_sales_price\n      ELSE NULL END) wed_sales,\n  sum(CASE WHEN (d_day_name = 'Thursday')\n    THEN ss_sales_price\n      ELSE NULL END) thu_sales,\n  sum(CASE WHEN (d_day_name = 'Friday')\n    THEN ss_sales_price\n      ELSE NULL END) fri_sales,\n  sum(CASE WHEN (d_day_name = 'Saturday')\n    THEN ss_sales_price\n      ELSE NULL END) sat_sales\nFROM date_dim, store_sales, store\nWHERE d_date_sk = ss_sold_date_sk AND\n  s_store_sk = ss_store_sk AND\n  s_gmt_offset = -5 AND\n  d_year = 2000\nGROUP BY s_store_name, s_store_id\nORDER BY s_store_name, s_store_id, sun_sales, mon_sales, tue_sales, wed_sales,\n  thu_sales, fri_sales, sat_sales\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q44.sql",
    "content": "SELECT\n  ascending.rnk,\n  i1.i_product_name best_performing,\n  i2.i_product_name worst_performing\nFROM (SELECT *\nFROM (SELECT\n  item_sk,\n  rank()\n  OVER (\n    ORDER BY rank_col ASC) rnk\nFROM (SELECT\n  ss_item_sk item_sk,\n  avg(ss_net_profit) rank_col\nFROM store_sales ss1\nWHERE ss_store_sk = 4\nGROUP BY ss_item_sk\nHAVING avg(ss_net_profit) > 0.9 * (SELECT avg(ss_net_profit) rank_col\nFROM store_sales\nWHERE ss_store_sk = 4\n  AND ss_addr_sk IS NULL\nGROUP BY ss_store_sk)) V1) V11\nWHERE rnk < 11) ascending,\n  (SELECT *\n  FROM (SELECT\n    item_sk,\n    rank()\n    OVER (\n      ORDER BY rank_col DESC) rnk\n  FROM (SELECT\n    ss_item_sk item_sk,\n    avg(ss_net_profit) rank_col\n  FROM store_sales ss1\n  WHERE ss_store_sk = 4\n  GROUP BY ss_item_sk\n  HAVING avg(ss_net_profit) > 0.9 * (SELECT avg(ss_net_profit) rank_col\n  FROM store_sales\n  WHERE ss_store_sk = 4\n    AND ss_addr_sk IS NULL\n  GROUP BY ss_store_sk)) V2) V21\n  WHERE rnk < 11) descending,\n  item i1, item i2\nWHERE ascending.rnk = descending.rnk\n  AND i1.i_item_sk = ascending.item_sk\n  AND i2.i_item_sk = descending.item_sk\nORDER BY ascending.rnk\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q45.sql",
    "content": "SELECT\n  ca_zip,\n  ca_city,\n  sum(ws_sales_price)\nFROM web_sales, customer, customer_address, date_dim, item\nWHERE ws_bill_customer_sk = c_customer_sk\n  AND c_current_addr_sk = ca_address_sk\n  AND ws_item_sk = i_item_sk\n  AND (substr(ca_zip, 1, 5) IN\n  ('85669', '86197', '88274', '83405', '86475', '85392', '85460', '80348', '81792')\n  OR\n  i_item_id IN (SELECT i_item_id\n  FROM item\n  WHERE i_item_sk IN (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)\n  )\n)\n  AND ws_sold_date_sk = d_date_sk\n  AND d_qoy = 2 AND d_year = 2001\nGROUP BY ca_zip, ca_city\nORDER BY ca_zip, ca_city\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q46.sql",
    "content": "SELECT\n  c_last_name,\n  c_first_name,\n  ca_city,\n  bought_city,\n  ss_ticket_number,\n  amt,\n  profit\nFROM\n  (SELECT\n    ss_ticket_number,\n    ss_customer_sk,\n    ca_city bought_city,\n    sum(ss_coupon_amt) amt,\n    sum(ss_net_profit) profit\n  FROM store_sales, date_dim, store, household_demographics, customer_address\n  WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk\n    AND store_sales.ss_store_sk = store.s_store_sk\n    AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND store_sales.ss_addr_sk = customer_address.ca_address_sk\n    AND (household_demographics.hd_dep_count = 4 OR\n    household_demographics.hd_vehicle_count = 3)\n    AND date_dim.d_dow IN (6, 0)\n    AND date_dim.d_year IN (1999, 1999 + 1, 1999 + 2)\n    AND store.s_city IN ('Fairview', 'Midway', 'Fairview', 'Fairview', 'Fairview')\n  GROUP BY ss_ticket_number, ss_customer_sk, ss_addr_sk, ca_city) dn, customer,\n  customer_address current_addr\nWHERE ss_customer_sk = c_customer_sk\n  AND customer.c_current_addr_sk = current_addr.ca_address_sk\n  AND current_addr.ca_city <> bought_city\nORDER BY c_last_name, c_first_name, ca_city, bought_city, ss_ticket_number\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q47.sql",
    "content": "WITH v1 AS (\n  SELECT\n    i_category,\n    i_brand,\n    s_store_name,\n    s_company_name,\n    d_year,\n    d_moy,\n    sum(ss_sales_price) sum_sales,\n    avg(sum(ss_sales_price))\n    OVER\n    (PARTITION BY i_category, i_brand,\n      s_store_name, s_company_name, d_year)\n    avg_monthly_sales,\n    rank()\n    OVER\n    (PARTITION BY i_category, i_brand,\n      s_store_name, s_company_name\n      ORDER BY d_year, d_moy) rn\n  FROM item, store_sales, date_dim, store\n  WHERE ss_item_sk = i_item_sk AND\n    ss_sold_date_sk = d_date_sk AND\n    ss_store_sk = s_store_sk AND\n    (\n      d_year = 1999 OR\n        (d_year = 1999 - 1 AND d_moy = 12) OR\n        (d_year = 1999 + 1 AND d_moy = 1)\n    )\n  GROUP BY i_category, i_brand,\n    s_store_name, s_company_name,\n    d_year, d_moy),\n    v2 AS (\n    SELECT\n      v1.i_category,\n      v1.i_brand,\n      v1.s_store_name,\n      v1.s_company_name,\n      v1.d_year,\n      v1.d_moy,\n      v1.avg_monthly_sales,\n      v1.sum_sales,\n      v1_lag.sum_sales psum,\n      v1_lead.sum_sales nsum\n    FROM v1, v1 v1_lag, v1 v1_lead\n    WHERE v1.i_category = v1_lag.i_category AND\n      v1.i_category = v1_lead.i_category AND\n      v1.i_brand = v1_lag.i_brand AND\n      v1.i_brand = v1_lead.i_brand AND\n      v1.s_store_name = v1_lag.s_store_name AND\n      v1.s_store_name = v1_lead.s_store_name AND\n      v1.s_company_name = v1_lag.s_company_name AND\n      v1.s_company_name = v1_lead.s_company_name AND\n      v1.rn = v1_lag.rn + 1 AND\n      v1.rn = v1_lead.rn - 1)\nSELECT *\nFROM v2\nWHERE d_year = 1999 AND\n  avg_monthly_sales > 0 AND\n  CASE WHEN avg_monthly_sales > 0\n    THEN abs(sum_sales - avg_monthly_sales) / avg_monthly_sales\n  ELSE NULL END > 0.1\nORDER BY sum_sales - avg_monthly_sales, 3\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q48.sql",
    "content": "SELECT sum(ss_quantity)\nFROM store_sales, store, customer_demographics, customer_address, date_dim\nWHERE s_store_sk = ss_store_sk\n  AND ss_sold_date_sk = d_date_sk AND d_year = 2001\n  AND\n  (\n    (\n      cd_demo_sk = ss_cdemo_sk\n        AND\n        cd_marital_status = 'M'\n        AND\n        cd_education_status = '4 yr Degree'\n        AND\n        ss_sales_price BETWEEN 100.00 AND 150.00\n    )\n      OR\n      (\n        cd_demo_sk = ss_cdemo_sk\n          AND\n          cd_marital_status = 'D'\n          AND\n          cd_education_status = '2 yr Degree'\n          AND\n          ss_sales_price BETWEEN 50.00 AND 100.00\n      )\n      OR\n      (\n        cd_demo_sk = ss_cdemo_sk\n          AND\n          cd_marital_status = 'S'\n          AND\n          cd_education_status = 'College'\n          AND\n          ss_sales_price BETWEEN 150.00 AND 200.00\n      )\n  )\n  AND\n  (\n    (\n      ss_addr_sk = ca_address_sk\n        AND\n        ca_country = 'United States'\n        AND\n        ca_state IN ('CO', 'OH', 'TX')\n        AND ss_net_profit BETWEEN 0 AND 2000\n    )\n      OR\n      (ss_addr_sk = ca_address_sk\n        AND\n        ca_country = 'United States'\n        AND\n        ca_state IN ('OR', 'MN', 'KY')\n        AND ss_net_profit BETWEEN 150 AND 3000\n      )\n      OR\n      (ss_addr_sk = ca_address_sk\n        AND\n        ca_country = 'United States'\n        AND\n        ca_state IN ('VA', 'CA', 'MS')\n        AND ss_net_profit BETWEEN 50 AND 25000\n      )\n  )\n"
  },
  {
    "path": "spark-queries-tpcds/q49.sql",
    "content": "SELECT\n  'web' AS channel,\n  web.item,\n  web.return_ratio,\n  web.return_rank,\n  web.currency_rank\nFROM (\n       SELECT\n         item,\n         return_ratio,\n         currency_ratio,\n         rank()\n         OVER (\n           ORDER BY return_ratio) AS return_rank,\n         rank()\n         OVER (\n           ORDER BY currency_ratio) AS currency_rank\n       FROM\n         (SELECT\n           ws.ws_item_sk AS item,\n           (cast(sum(coalesce(wr.wr_return_quantity, 0)) AS DECIMAL(15, 4)) /\n             cast(sum(coalesce(ws.ws_quantity, 0)) AS DECIMAL(15, 4))) AS return_ratio,\n           (cast(sum(coalesce(wr.wr_return_amt, 0)) AS DECIMAL(15, 4)) /\n             cast(sum(coalesce(ws.ws_net_paid, 0)) AS DECIMAL(15, 4))) AS currency_ratio\n         FROM\n           web_sales ws LEFT OUTER JOIN web_returns wr\n             ON (ws.ws_order_number = wr.wr_order_number AND\n             ws.ws_item_sk = wr.wr_item_sk)\n           , date_dim\n         WHERE\n           wr.wr_return_amt > 10000\n             AND ws.ws_net_profit > 1\n             AND ws.ws_net_paid > 0\n             AND ws.ws_quantity > 0\n             AND ws_sold_date_sk = d_date_sk\n             AND d_year = 2001\n             AND d_moy = 12\n         GROUP BY ws.ws_item_sk\n         ) in_web\n     ) web\nWHERE (web.return_rank <= 10 OR web.currency_rank <= 10)\nUNION\nSELECT\n  'catalog' AS channel,\n  catalog.item,\n  catalog.return_ratio,\n  catalog.return_rank,\n  catalog.currency_rank\nFROM (\n       SELECT\n         item,\n         return_ratio,\n         currency_ratio,\n         rank()\n         OVER (\n           ORDER BY return_ratio) AS return_rank,\n         rank()\n         OVER (\n           ORDER BY currency_ratio) AS currency_rank\n       FROM\n         (SELECT\n           cs.cs_item_sk AS item,\n           (cast(sum(coalesce(cr.cr_return_quantity, 0)) AS DECIMAL(15, 4)) /\n             cast(sum(coalesce(cs.cs_quantity, 0)) AS DECIMAL(15, 4))) AS 
return_ratio,\n           (cast(sum(coalesce(cr.cr_return_amount, 0)) AS DECIMAL(15, 4)) /\n             cast(sum(coalesce(cs.cs_net_paid, 0)) AS DECIMAL(15, 4))) AS currency_ratio\n         FROM\n           catalog_sales cs LEFT OUTER JOIN catalog_returns cr\n             ON (cs.cs_order_number = cr.cr_order_number AND\n             cs.cs_item_sk = cr.cr_item_sk)\n           , date_dim\n         WHERE\n           cr.cr_return_amount > 10000\n             AND cs.cs_net_profit > 1\n             AND cs.cs_net_paid > 0\n             AND cs.cs_quantity > 0\n             AND cs_sold_date_sk = d_date_sk\n             AND d_year = 2001\n             AND d_moy = 12\n         GROUP BY cs.cs_item_sk\n         ) in_cat\n     ) catalog\nWHERE (catalog.return_rank <= 10 OR catalog.currency_rank <= 10)\nUNION\nSELECT\n  'store' AS channel,\n  store.item,\n  store.return_ratio,\n  store.return_rank,\n  store.currency_rank\nFROM (\n       SELECT\n         item,\n         return_ratio,\n         currency_ratio,\n         rank()\n         OVER (\n           ORDER BY return_ratio) AS return_rank,\n         rank()\n         OVER (\n           ORDER BY currency_ratio) AS currency_rank\n       FROM\n         (SELECT\n           sts.ss_item_sk AS item,\n           (cast(sum(coalesce(sr.sr_return_quantity, 0)) AS DECIMAL(15, 4)) /\n             cast(sum(coalesce(sts.ss_quantity, 0)) AS DECIMAL(15, 4))) AS return_ratio,\n           (cast(sum(coalesce(sr.sr_return_amt, 0)) AS DECIMAL(15, 4)) /\n             cast(sum(coalesce(sts.ss_net_paid, 0)) AS DECIMAL(15, 4))) AS currency_ratio\n         FROM\n           store_sales sts LEFT OUTER JOIN store_returns sr\n             ON (sts.ss_ticket_number = sr.sr_ticket_number AND sts.ss_item_sk = sr.sr_item_sk)\n           , date_dim\n         WHERE\n           sr.sr_return_amt > 10000\n             AND sts.ss_net_profit > 1\n             AND sts.ss_net_paid > 0\n             AND sts.ss_quantity > 0\n             AND ss_sold_date_sk = d_date_sk\n    
         AND d_year = 2001\n             AND d_moy = 12\n         GROUP BY sts.ss_item_sk\n         ) in_store\n     ) store\nWHERE (store.return_rank <= 10 OR store.currency_rank <= 10)\nORDER BY 1, 4, 5\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q5.sql",
    "content": "WITH ssr AS\n( SELECT\n    s_store_id,\n    sum(sales_price) AS sales,\n    sum(profit) AS profit,\n    sum(return_amt) AS RETURNS,\n    sum(net_loss) AS profit_loss\n  FROM\n    (SELECT\n       ss_store_sk AS store_sk,\n       ss_sold_date_sk AS date_sk,\n       ss_ext_sales_price AS sales_price,\n       ss_net_profit AS profit,\n       cast(0 AS DECIMAL(7, 2)) AS return_amt,\n       cast(0 AS DECIMAL(7, 2)) AS net_loss\n     FROM store_sales\n     UNION ALL\n     SELECT\n       sr_store_sk AS store_sk,\n       sr_returned_date_sk AS date_sk,\n       cast(0 AS DECIMAL(7, 2)) AS sales_price,\n       cast(0 AS DECIMAL(7, 2)) AS profit,\n       sr_return_amt AS return_amt,\n       sr_net_loss AS net_loss\n     FROM store_returns)\n    salesreturns, date_dim, store\n  WHERE date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-23' AS DATE)\n  AND ((cast('2000-08-23' AS DATE) + INTERVAL 14 days))\n    AND store_sk = s_store_sk\n  GROUP BY s_store_id),\n    csr AS\n  ( SELECT\n    cp_catalog_page_id,\n    sum(sales_price) AS sales,\n    sum(profit) AS profit,\n    sum(return_amt) AS RETURNS,\n    sum(net_loss) AS profit_loss\n  FROM\n    (SELECT\n       cs_catalog_page_sk AS page_sk,\n       cs_sold_date_sk AS date_sk,\n       cs_ext_sales_price AS sales_price,\n       cs_net_profit AS profit,\n       cast(0 AS DECIMAL(7, 2)) AS return_amt,\n       cast(0 AS DECIMAL(7, 2)) AS net_loss\n     FROM catalog_sales\n     UNION ALL\n     SELECT\n       cr_catalog_page_sk AS page_sk,\n       cr_returned_date_sk AS date_sk,\n       cast(0 AS DECIMAL(7, 2)) AS sales_price,\n       cast(0 AS DECIMAL(7, 2)) AS profit,\n       cr_return_amount AS return_amt,\n       cr_net_loss AS net_loss\n     FROM catalog_returns\n    ) salesreturns, date_dim, catalog_page\n  WHERE date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-23' AS DATE)\n  AND ((cast('2000-08-23' AS DATE) + INTERVAL 14 days))\n    AND page_sk = cp_catalog_page_sk\n  GROUP BY 
cp_catalog_page_id)\n  ,\n    wsr AS\n  ( SELECT\n    web_site_id,\n    sum(sales_price) AS sales,\n    sum(profit) AS profit,\n    sum(return_amt) AS RETURNS,\n    sum(net_loss) AS profit_loss\n  FROM\n    (SELECT\n       ws_web_site_sk AS wsr_web_site_sk,\n       ws_sold_date_sk AS date_sk,\n       ws_ext_sales_price AS sales_price,\n       ws_net_profit AS profit,\n       cast(0 AS DECIMAL(7, 2)) AS return_amt,\n       cast(0 AS DECIMAL(7, 2)) AS net_loss\n     FROM web_sales\n     UNION ALL\n     SELECT\n       ws_web_site_sk AS wsr_web_site_sk,\n       wr_returned_date_sk AS date_sk,\n       cast(0 AS DECIMAL(7, 2)) AS sales_price,\n       cast(0 AS DECIMAL(7, 2)) AS profit,\n       wr_return_amt AS return_amt,\n       wr_net_loss AS net_loss\n     FROM web_returns\n       LEFT OUTER JOIN web_sales ON\n                                   (wr_item_sk = ws_item_sk\n                                     AND wr_order_number = ws_order_number)\n    ) salesreturns, date_dim, web_site\n  WHERE date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-23' AS DATE)\n  AND ((cast('2000-08-23' AS DATE) + INTERVAL 14 days))\n    AND wsr_web_site_sk = web_site_sk\n  GROUP BY web_site_id)\nSELECT\n  channel,\n  id,\n  sum(sales) AS sales,\n  sum(returns) AS returns,\n  sum(profit) AS profit\nFROM\n  (SELECT\n     'store channel' AS channel,\n     concat('store', s_store_id) AS id,\n     sales,\n     returns,\n     (profit - profit_loss) AS profit\n   FROM ssr\n   UNION ALL\n   SELECT\n     'catalog channel' AS channel,\n     concat('catalog_page', cp_catalog_page_id) AS id,\n     sales,\n     returns,\n     (profit - profit_loss) AS profit\n   FROM csr\n   UNION ALL\n   SELECT\n     'web channel' AS channel,\n     concat('web_site', web_site_id) AS id,\n     sales,\n     returns,\n     (profit - profit_loss) AS profit\n   FROM wsr\n  ) x\nGROUP BY ROLLUP (channel, id)\nORDER BY channel, id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q50.sql",
    "content": "SELECT\n  s_store_name,\n  s_company_id,\n  s_street_number,\n  s_street_name,\n  s_street_type,\n  s_suite_number,\n  s_city,\n  s_county,\n  s_state,\n  s_zip,\n  sum(CASE WHEN (sr_returned_date_sk - ss_sold_date_sk <= 30)\n    THEN 1\n      ELSE 0 END)  AS `30 days `,\n  sum(CASE WHEN (sr_returned_date_sk - ss_sold_date_sk > 30) AND\n    (sr_returned_date_sk - ss_sold_date_sk <= 60)\n    THEN 1\n      ELSE 0 END)  AS `31 - 60 days `,\n  sum(CASE WHEN (sr_returned_date_sk - ss_sold_date_sk > 60) AND\n    (sr_returned_date_sk - ss_sold_date_sk <= 90)\n    THEN 1\n      ELSE 0 END)  AS `61 - 90 days `,\n  sum(CASE WHEN (sr_returned_date_sk - ss_sold_date_sk > 90) AND\n    (sr_returned_date_sk - ss_sold_date_sk <= 120)\n    THEN 1\n      ELSE 0 END)  AS `91 - 120 days `,\n  sum(CASE WHEN (sr_returned_date_sk - ss_sold_date_sk > 120)\n    THEN 1\n      ELSE 0 END)  AS `>120 days `\nFROM\n  store_sales, store_returns, store, date_dim d1, date_dim d2\nWHERE\n  d2.d_year = 2001\n    AND d2.d_moy = 8\n    AND ss_ticket_number = sr_ticket_number\n    AND ss_item_sk = sr_item_sk\n    AND ss_sold_date_sk = d1.d_date_sk\n    AND sr_returned_date_sk = d2.d_date_sk\n    AND ss_customer_sk = sr_customer_sk\n    AND ss_store_sk = s_store_sk\nGROUP BY\n  s_store_name, s_company_id, s_street_number, s_street_name, s_street_type,\n  s_suite_number, s_city, s_county, s_state, s_zip\nORDER BY\n  s_store_name, s_company_id, s_street_number, s_street_name, s_street_type,\n  s_suite_number, s_city, s_county, s_state, s_zip\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q51.sql",
    "content": "WITH web_v1 AS (\n  SELECT\n    ws_item_sk item_sk,\n    d_date,\n    sum(sum(ws_sales_price))\n    OVER (PARTITION BY ws_item_sk\n      ORDER BY d_date\n      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) cume_sales\n  FROM web_sales, date_dim\n  WHERE ws_sold_date_sk = d_date_sk\n    AND d_month_seq BETWEEN 1200 AND 1200 + 11\n    AND ws_item_sk IS NOT NULL\n  GROUP BY ws_item_sk, d_date),\n    store_v1 AS (\n    SELECT\n      ss_item_sk item_sk,\n      d_date,\n      sum(sum(ss_sales_price))\n      OVER (PARTITION BY ss_item_sk\n        ORDER BY d_date\n        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) cume_sales\n    FROM store_sales, date_dim\n    WHERE ss_sold_date_sk = d_date_sk\n      AND d_month_seq BETWEEN 1200 AND 1200 + 11\n      AND ss_item_sk IS NOT NULL\n    GROUP BY ss_item_sk, d_date)\nSELECT *\nFROM (SELECT\n  item_sk,\n  d_date,\n  web_sales,\n  store_sales,\n  max(web_sales)\n  OVER (PARTITION BY item_sk\n    ORDER BY d_date\n    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) web_cumulative,\n  max(store_sales)\n  OVER (PARTITION BY item_sk\n    ORDER BY d_date\n    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) store_cumulative\nFROM (SELECT\n  CASE WHEN web.item_sk IS NOT NULL\n    THEN web.item_sk\n  ELSE store.item_sk END item_sk,\n  CASE WHEN web.d_date IS NOT NULL\n    THEN web.d_date\n  ELSE store.d_date END d_date,\n  web.cume_sales web_sales,\n  store.cume_sales store_sales\nFROM web_v1 web FULL OUTER JOIN store_v1 store ON (web.item_sk = store.item_sk\n  AND web.d_date = store.d_date)\n     ) x) y\nWHERE web_cumulative > store_cumulative\nORDER BY item_sk, d_date\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q52.sql",
    "content": "SELECT\n  dt.d_year,\n  item.i_brand_id brand_id,\n  item.i_brand brand,\n  sum(ss_ext_sales_price) ext_price\nFROM date_dim dt, store_sales, item\nWHERE dt.d_date_sk = store_sales.ss_sold_date_sk\n  AND store_sales.ss_item_sk = item.i_item_sk\n  AND item.i_manager_id = 1\n  AND dt.d_moy = 11\n  AND dt.d_year = 2000\nGROUP BY dt.d_year, item.i_brand, item.i_brand_id\nORDER BY dt.d_year, ext_price DESC, brand_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q53.sql",
    "content": "SELECT *\nFROM\n  (SELECT\n    i_manufact_id,\n    sum(ss_sales_price) sum_sales,\n    avg(sum(ss_sales_price))\n    OVER (PARTITION BY i_manufact_id) avg_quarterly_sales\n  FROM item, store_sales, date_dim, store\n  WHERE ss_item_sk = i_item_sk AND\n    ss_sold_date_sk = d_date_sk AND\n    ss_store_sk = s_store_sk AND\n    d_month_seq IN (1200, 1200 + 1, 1200 + 2, 1200 + 3, 1200 + 4, 1200 + 5, 1200 + 6,\n                          1200 + 7, 1200 + 8, 1200 + 9, 1200 + 10, 1200 + 11) AND\n    ((i_category IN ('Books', 'Children', 'Electronics') AND\n      i_class IN ('personal', 'portable', 'reference', 'self-help') AND\n      i_brand IN ('scholaramalgamalg #14', 'scholaramalgamalg #7',\n                  'exportiunivamalg #9', 'scholaramalgamalg #9'))\n      OR\n      (i_category IN ('Women', 'Music', 'Men') AND\n        i_class IN ('accessories', 'classical', 'fragrances', 'pants') AND\n        i_brand IN ('amalgimporto #1', 'edu packscholar #1', 'exportiimporto #1',\n                    'importoamalg #1')))\n  GROUP BY i_manufact_id, d_qoy) tmp1\nWHERE CASE WHEN avg_quarterly_sales > 0\n  THEN abs(sum_sales - avg_quarterly_sales) / avg_quarterly_sales\n      ELSE NULL END > 0.1\nORDER BY avg_quarterly_sales,\n  sum_sales,\n  i_manufact_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q54.sql",
    "content": "WITH my_customers AS (\n  SELECT DISTINCT\n    c_customer_sk,\n    c_current_addr_sk\n  FROM\n    (SELECT\n       cs_sold_date_sk sold_date_sk,\n       cs_bill_customer_sk customer_sk,\n       cs_item_sk item_sk\n     FROM catalog_sales\n     UNION ALL\n     SELECT\n       ws_sold_date_sk sold_date_sk,\n       ws_bill_customer_sk customer_sk,\n       ws_item_sk item_sk\n     FROM web_sales\n    ) cs_or_ws_sales,\n    item,\n    date_dim,\n    customer\n  WHERE sold_date_sk = d_date_sk\n    AND item_sk = i_item_sk\n    AND i_category = 'Women'\n    AND i_class = 'maternity'\n    AND c_customer_sk = cs_or_ws_sales.customer_sk\n    AND d_moy = 12\n    AND d_year = 1998\n)\n  , my_revenue AS (\n  SELECT\n    c_customer_sk,\n    sum(ss_ext_sales_price) AS revenue\n  FROM my_customers,\n    store_sales,\n    customer_address,\n    store,\n    date_dim\n  WHERE c_current_addr_sk = ca_address_sk\n    AND ca_county = s_county\n    AND ca_state = s_state\n    AND ss_sold_date_sk = d_date_sk\n    AND c_customer_sk = ss_customer_sk\n    AND d_month_seq BETWEEN (SELECT DISTINCT d_month_seq + 1\n  FROM date_dim\n  WHERE d_year = 1998 AND d_moy = 12)\n  AND (SELECT DISTINCT d_month_seq + 3\n  FROM date_dim\n  WHERE d_year = 1998 AND d_moy = 12)\n  GROUP BY c_customer_sk\n)\n  , segments AS\n(SELECT cast((revenue / 50) AS INT) AS segment\n  FROM my_revenue)\nSELECT\n  segment,\n  count(*) AS num_customers,\n  segment * 50 AS segment_base\nFROM segments\nGROUP BY segment\nORDER BY segment, num_customers\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q55.sql",
    "content": "SELECT\n  i_brand_id brand_id,\n  i_brand brand,\n  sum(ss_ext_sales_price) ext_price\nFROM date_dim, store_sales, item\nWHERE d_date_sk = ss_sold_date_sk\n  AND ss_item_sk = i_item_sk\n  AND i_manager_id = 28\n  AND d_moy = 11\n  AND d_year = 1999\nGROUP BY i_brand, i_brand_id\nORDER BY ext_price DESC, brand_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q56.sql",
    "content": "WITH ss AS (\n  SELECT\n    i_item_id,\n    sum(ss_ext_sales_price) total_sales\n  FROM\n    store_sales, date_dim, customer_address, item\n  WHERE\n    i_item_id IN (SELECT i_item_id\n    FROM item\n    WHERE i_color IN ('slate', 'blanched', 'burnished'))\n      AND ss_item_sk = i_item_sk\n      AND ss_sold_date_sk = d_date_sk\n      AND d_year = 2001\n      AND d_moy = 2\n      AND ss_addr_sk = ca_address_sk\n      AND ca_gmt_offset = -5\n  GROUP BY i_item_id),\n    cs AS (\n    SELECT\n      i_item_id,\n      sum(cs_ext_sales_price) total_sales\n    FROM\n      catalog_sales, date_dim, customer_address, item\n    WHERE\n      i_item_id IN (SELECT i_item_id\n      FROM item\n      WHERE i_color IN ('slate', 'blanched', 'burnished'))\n        AND cs_item_sk = i_item_sk\n        AND cs_sold_date_sk = d_date_sk\n        AND d_year = 2001\n        AND d_moy = 2\n        AND cs_bill_addr_sk = ca_address_sk\n        AND ca_gmt_offset = -5\n    GROUP BY i_item_id),\n    ws AS (\n    SELECT\n      i_item_id,\n      sum(ws_ext_sales_price) total_sales\n    FROM\n      web_sales, date_dim, customer_address, item\n    WHERE\n      i_item_id IN (SELECT i_item_id\n      FROM item\n      WHERE i_color IN ('slate', 'blanched', 'burnished'))\n        AND ws_item_sk = i_item_sk\n        AND ws_sold_date_sk = d_date_sk\n        AND d_year = 2001\n        AND d_moy = 2\n        AND ws_bill_addr_sk = ca_address_sk\n        AND ca_gmt_offset = -5\n    GROUP BY i_item_id)\nSELECT\n  i_item_id,\n  sum(total_sales) total_sales\nFROM (SELECT *\n      FROM ss\n      UNION ALL\n      SELECT *\n      FROM cs\n      UNION ALL\n      SELECT *\n      FROM ws) tmp1\nGROUP BY i_item_id\nORDER BY total_sales\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q57.sql",
    "content": "WITH v1 AS (\n  SELECT\n    i_category,\n    i_brand,\n    cc_name,\n    d_year,\n    d_moy,\n    sum(cs_sales_price) sum_sales,\n    avg(sum(cs_sales_price))\n    OVER\n    (PARTITION BY i_category, i_brand, cc_name, d_year)\n    avg_monthly_sales,\n    rank()\n    OVER\n    (PARTITION BY i_category, i_brand, cc_name\n      ORDER BY d_year, d_moy) rn\n  FROM item, catalog_sales, date_dim, call_center\n  WHERE cs_item_sk = i_item_sk AND\n    cs_sold_date_sk = d_date_sk AND\n    cc_call_center_sk = cs_call_center_sk AND\n    (\n      d_year = 1999 OR\n        (d_year = 1999 - 1 AND d_moy = 12) OR\n        (d_year = 1999 + 1 AND d_moy = 1)\n    )\n  GROUP BY i_category, i_brand,\n    cc_name, d_year, d_moy),\n    v2 AS (\n    SELECT\n      v1.i_category,\n      v1.i_brand,\n      v1.cc_name,\n      v1.d_year,\n      v1.d_moy,\n      v1.avg_monthly_sales,\n      v1.sum_sales,\n      v1_lag.sum_sales psum,\n      v1_lead.sum_sales nsum\n    FROM v1, v1 v1_lag, v1 v1_lead\n    WHERE v1.i_category = v1_lag.i_category AND\n      v1.i_category = v1_lead.i_category AND\n      v1.i_brand = v1_lag.i_brand AND\n      v1.i_brand = v1_lead.i_brand AND\n      v1.cc_name = v1_lag.cc_name AND\n      v1.cc_name = v1_lead.cc_name AND\n      v1.rn = v1_lag.rn + 1 AND\n      v1.rn = v1_lead.rn - 1)\nSELECT *\nFROM v2\nWHERE d_year = 1999 AND\n  avg_monthly_sales > 0 AND\n  CASE WHEN avg_monthly_sales > 0\n    THEN abs(sum_sales - avg_monthly_sales) / avg_monthly_sales\n  ELSE NULL END > 0.1\nORDER BY sum_sales - avg_monthly_sales, 3\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q58.sql",
    "content": "WITH ss_items AS\n(SELECT\n    i_item_id item_id,\n    sum(ss_ext_sales_price) ss_item_rev\n  FROM store_sales, item, date_dim\n  WHERE ss_item_sk = i_item_sk\n    AND d_date IN (SELECT d_date\n  FROM date_dim\n  WHERE d_week_seq = (SELECT d_week_seq\n  FROM date_dim\n  WHERE d_date = '2000-01-03'))\n    AND ss_sold_date_sk = d_date_sk\n  GROUP BY i_item_id),\n    cs_items AS\n  (SELECT\n    i_item_id item_id,\n    sum(cs_ext_sales_price) cs_item_rev\n  FROM catalog_sales, item, date_dim\n  WHERE cs_item_sk = i_item_sk\n    AND d_date IN (SELECT d_date\n  FROM date_dim\n  WHERE d_week_seq = (SELECT d_week_seq\n  FROM date_dim\n  WHERE d_date = '2000-01-03'))\n    AND cs_sold_date_sk = d_date_sk\n  GROUP BY i_item_id),\n    ws_items AS\n  (SELECT\n    i_item_id item_id,\n    sum(ws_ext_sales_price) ws_item_rev\n  FROM web_sales, item, date_dim\n  WHERE ws_item_sk = i_item_sk\n    AND d_date IN (SELECT d_date\n  FROM date_dim\n  WHERE d_week_seq = (SELECT d_week_seq\n  FROM date_dim\n  WHERE d_date = '2000-01-03'))\n    AND ws_sold_date_sk = d_date_sk\n  GROUP BY i_item_id)\nSELECT\n  ss_items.item_id,\n  ss_item_rev,\n  ss_item_rev / ((ss_item_rev + cs_item_rev + ws_item_rev) / 3) * 100 ss_dev,\n  cs_item_rev,\n  cs_item_rev / ((ss_item_rev + cs_item_rev + ws_item_rev) / 3) * 100 cs_dev,\n  ws_item_rev,\n  ws_item_rev / ((ss_item_rev + cs_item_rev + ws_item_rev) / 3) * 100 ws_dev,\n  (ss_item_rev + cs_item_rev + ws_item_rev) / 3 average\nFROM ss_items, cs_items, ws_items\nWHERE ss_items.item_id = cs_items.item_id\n  AND ss_items.item_id = ws_items.item_id\n  AND ss_item_rev BETWEEN 0.9 * cs_item_rev AND 1.1 * cs_item_rev\n  AND ss_item_rev BETWEEN 0.9 * ws_item_rev AND 1.1 * ws_item_rev\n  AND cs_item_rev BETWEEN 0.9 * ss_item_rev AND 1.1 * ss_item_rev\n  AND cs_item_rev BETWEEN 0.9 * ws_item_rev AND 1.1 * ws_item_rev\n  AND ws_item_rev BETWEEN 0.9 * ss_item_rev AND 1.1 * ss_item_rev\n  AND ws_item_rev BETWEEN 0.9 * cs_item_rev AND 1.1 * cs_item_rev\nORDER BY item_id, ss_item_rev\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q59.sql",
    "content": "WITH wss AS\n(SELECT\n    d_week_seq,\n    ss_store_sk,\n    sum(CASE WHEN (d_day_name = 'Sunday')\n      THEN ss_sales_price\n        ELSE NULL END) sun_sales,\n    sum(CASE WHEN (d_day_name = 'Monday')\n      THEN ss_sales_price\n        ELSE NULL END) mon_sales,\n    sum(CASE WHEN (d_day_name = 'Tuesday')\n      THEN ss_sales_price\n        ELSE NULL END) tue_sales,\n    sum(CASE WHEN (d_day_name = 'Wednesday')\n      THEN ss_sales_price\n        ELSE NULL END) wed_sales,\n    sum(CASE WHEN (d_day_name = 'Thursday')\n      THEN ss_sales_price\n        ELSE NULL END) thu_sales,\n    sum(CASE WHEN (d_day_name = 'Friday')\n      THEN ss_sales_price\n        ELSE NULL END) fri_sales,\n    sum(CASE WHEN (d_day_name = 'Saturday')\n      THEN ss_sales_price\n        ELSE NULL END) sat_sales\n  FROM store_sales, date_dim\n  WHERE d_date_sk = ss_sold_date_sk\n  GROUP BY d_week_seq, ss_store_sk\n)\nSELECT\n  s_store_name1,\n  s_store_id1,\n  d_week_seq1,\n  sun_sales1 / sun_sales2,\n  mon_sales1 / mon_sales2,\n  tue_sales1 / tue_sales2,\n  wed_sales1 / wed_sales2,\n  thu_sales1 / thu_sales2,\n  fri_sales1 / fri_sales2,\n  sat_sales1 / sat_sales2\nFROM\n  (SELECT\n    s_store_name s_store_name1,\n    wss.d_week_seq d_week_seq1,\n    s_store_id s_store_id1,\n    sun_sales sun_sales1,\n    mon_sales mon_sales1,\n    tue_sales tue_sales1,\n    wed_sales wed_sales1,\n    thu_sales thu_sales1,\n    fri_sales fri_sales1,\n    sat_sales sat_sales1\n  FROM wss, store, date_dim d\n  WHERE d.d_week_seq = wss.d_week_seq AND\n    ss_store_sk = s_store_sk AND\n    d_month_seq BETWEEN 1212 AND 1212 + 11) y,\n  (SELECT\n    s_store_name s_store_name2,\n    wss.d_week_seq d_week_seq2,\n    s_store_id s_store_id2,\n    sun_sales sun_sales2,\n    mon_sales mon_sales2,\n    tue_sales tue_sales2,\n    wed_sales wed_sales2,\n    thu_sales thu_sales2,\n    fri_sales fri_sales2,\n    sat_sales sat_sales2\n  FROM wss, store, date_dim d\n  WHERE d.d_week_seq = wss.d_week_seq AND\n  
  ss_store_sk = s_store_sk AND\n    d_month_seq BETWEEN 1212 + 12 AND 1212 + 23) x\nWHERE s_store_id1 = s_store_id2\n  AND d_week_seq1 = d_week_seq2 - 52\nORDER BY s_store_name1, s_store_id1, d_week_seq1\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q6.sql",
    "content": "SELECT\n  a.ca_state state,\n  count(*) cnt\nFROM\n  customer_address a, customer c, store_sales s, date_dim d, item i\nWHERE a.ca_address_sk = c.c_current_addr_sk\n  AND c.c_customer_sk = s.ss_customer_sk\n  AND s.ss_sold_date_sk = d.d_date_sk\n  AND s.ss_item_sk = i.i_item_sk\n  AND d.d_month_seq =\n  (SELECT DISTINCT (d_month_seq)\n  FROM date_dim\n  WHERE d_year = 2000 AND d_moy = 1)\n  AND i.i_current_price > 1.2 *\n  (SELECT avg(j.i_current_price)\n  FROM item j\n  WHERE j.i_category = i.i_category)\nGROUP BY a.ca_state\nHAVING count(*) >= 10\nORDER BY cnt\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q60.sql",
    "content": "WITH ss AS (\n  SELECT\n    i_item_id,\n    sum(ss_ext_sales_price) total_sales\n  FROM store_sales, date_dim, customer_address, item\n  WHERE\n    i_item_id IN (SELECT i_item_id\n    FROM item\n    WHERE i_category IN ('Music'))\n      AND ss_item_sk = i_item_sk\n      AND ss_sold_date_sk = d_date_sk\n      AND d_year = 1998\n      AND d_moy = 9\n      AND ss_addr_sk = ca_address_sk\n      AND ca_gmt_offset = -5\n  GROUP BY i_item_id),\n    cs AS (\n    SELECT\n      i_item_id,\n      sum(cs_ext_sales_price) total_sales\n    FROM catalog_sales, date_dim, customer_address, item\n    WHERE\n      i_item_id IN (SELECT i_item_id\n      FROM item\n      WHERE i_category IN ('Music'))\n        AND cs_item_sk = i_item_sk\n        AND cs_sold_date_sk = d_date_sk\n        AND d_year = 1998\n        AND d_moy = 9\n        AND cs_bill_addr_sk = ca_address_sk\n        AND ca_gmt_offset = -5\n    GROUP BY i_item_id),\n    ws AS (\n    SELECT\n      i_item_id,\n      sum(ws_ext_sales_price) total_sales\n    FROM web_sales, date_dim, customer_address, item\n    WHERE\n      i_item_id IN (SELECT i_item_id\n      FROM item\n      WHERE i_category IN ('Music'))\n        AND ws_item_sk = i_item_sk\n        AND ws_sold_date_sk = d_date_sk\n        AND d_year = 1998\n        AND d_moy = 9\n        AND ws_bill_addr_sk = ca_address_sk\n        AND ca_gmt_offset = -5\n    GROUP BY i_item_id)\nSELECT\n  i_item_id,\n  sum(total_sales) total_sales\nFROM (SELECT *\n      FROM ss\n      UNION ALL\n      SELECT *\n      FROM cs\n      UNION ALL\n      SELECT *\n      FROM ws) tmp1\nGROUP BY i_item_id\nORDER BY i_item_id, total_sales\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q61.sql",
    "content": "SELECT\n  promotions,\n  total,\n  cast(promotions AS DECIMAL(15, 4)) / cast(total AS DECIMAL(15, 4)) * 100\nFROM\n  (SELECT sum(ss_ext_sales_price) promotions\n  FROM store_sales, store, promotion, date_dim, customer, customer_address, item\n  WHERE ss_sold_date_sk = d_date_sk\n    AND ss_store_sk = s_store_sk\n    AND ss_promo_sk = p_promo_sk\n    AND ss_customer_sk = c_customer_sk\n    AND ca_address_sk = c_current_addr_sk\n    AND ss_item_sk = i_item_sk\n    AND ca_gmt_offset = -5\n    AND i_category = 'Jewelry'\n    AND (p_channel_dmail = 'Y' OR p_channel_email = 'Y' OR p_channel_tv = 'Y')\n    AND s_gmt_offset = -5\n    AND d_year = 1998\n    AND d_moy = 11) promotional_sales,\n  (SELECT sum(ss_ext_sales_price) total\n  FROM store_sales, store, date_dim, customer, customer_address, item\n  WHERE ss_sold_date_sk = d_date_sk\n    AND ss_store_sk = s_store_sk\n    AND ss_customer_sk = c_customer_sk\n    AND ca_address_sk = c_current_addr_sk\n    AND ss_item_sk = i_item_sk\n    AND ca_gmt_offset = -5\n    AND i_category = 'Jewelry'\n    AND s_gmt_offset = -5\n    AND d_year = 1998\n    AND d_moy = 11) all_sales\nORDER BY promotions, total\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q62.sql",
    "content": "SELECT\n  substr(w_warehouse_name, 1, 20),\n  sm_type,\n  web_name,\n  sum(CASE WHEN (ws_ship_date_sk - ws_sold_date_sk <= 30)\n    THEN 1\n      ELSE 0 END)  AS `30 days `,\n  sum(CASE WHEN (ws_ship_date_sk - ws_sold_date_sk > 30) AND\n    (ws_ship_date_sk - ws_sold_date_sk <= 60)\n    THEN 1\n      ELSE 0 END)  AS `31 - 60 days `,\n  sum(CASE WHEN (ws_ship_date_sk - ws_sold_date_sk > 60) AND\n    (ws_ship_date_sk - ws_sold_date_sk <= 90)\n    THEN 1\n      ELSE 0 END)  AS `61 - 90 days `,\n  sum(CASE WHEN (ws_ship_date_sk - ws_sold_date_sk > 90) AND\n    (ws_ship_date_sk - ws_sold_date_sk <= 120)\n    THEN 1\n      ELSE 0 END)  AS `91 - 120 days `,\n  sum(CASE WHEN (ws_ship_date_sk - ws_sold_date_sk > 120)\n    THEN 1\n      ELSE 0 END)  AS `>120 days `\nFROM\n  web_sales, warehouse, ship_mode, web_site, date_dim\nWHERE\n  d_month_seq BETWEEN 1200 AND 1200 + 11\n    AND ws_ship_date_sk = d_date_sk\n    AND ws_warehouse_sk = w_warehouse_sk\n    AND ws_ship_mode_sk = sm_ship_mode_sk\n    AND ws_web_site_sk = web_site_sk\nGROUP BY\n  substr(w_warehouse_name, 1, 20), sm_type, web_name\nORDER BY\n  substr(w_warehouse_name, 1, 20), sm_type, web_name\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q63.sql",
    "content": "SELECT *\nFROM (SELECT\n  i_manager_id,\n  sum(ss_sales_price) sum_sales,\n  avg(sum(ss_sales_price))\n  OVER (PARTITION BY i_manager_id) avg_monthly_sales\nFROM item\n  , store_sales\n  , date_dim\n  , store\nWHERE ss_item_sk = i_item_sk\n  AND ss_sold_date_sk = d_date_sk\n  AND ss_store_sk = s_store_sk\n  AND d_month_seq IN (1200, 1200 + 1, 1200 + 2, 1200 + 3, 1200 + 4, 1200 + 5, 1200 + 6, 1200 + 7,\n                            1200 + 8, 1200 + 9, 1200 + 10, 1200 + 11)\n  AND ((i_category IN ('Books', 'Children', 'Electronics')\n  AND i_class IN ('personal', 'portable', 'refernece', 'self-help')\n  AND i_brand IN ('scholaramalgamalg #14', 'scholaramalgamalg #7',\n                  'exportiunivamalg #9', 'scholaramalgamalg #9'))\n  OR (i_category IN ('Women', 'Music', 'Men')\n  AND i_class IN ('accessories', 'classical', 'fragrances', 'pants')\n  AND i_brand IN ('amalgimporto #1', 'edu packscholar #1', 'exportiimporto #1',\n                  'importoamalg #1')))\nGROUP BY i_manager_id, d_moy) tmp1\nWHERE CASE WHEN avg_monthly_sales > 0\n  THEN abs(sum_sales - avg_monthly_sales) / avg_monthly_sales\n      ELSE NULL END > 0.1\nORDER BY i_manager_id\n  , avg_monthly_sales\n  , sum_sales\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q64.sql",
    "content": "WITH cs_ui AS\n(SELECT\n    cs_item_sk,\n    sum(cs_ext_list_price) AS sale,\n    sum(cr_refunded_cash + cr_reversed_charge + cr_store_credit) AS refund\n  FROM catalog_sales\n    , catalog_returns\n  WHERE cs_item_sk = cr_item_sk\n    AND cs_order_number = cr_order_number\n  GROUP BY cs_item_sk\n  HAVING sum(cs_ext_list_price) > 2 * sum(cr_refunded_cash + cr_reversed_charge + cr_store_credit)),\n    cross_sales AS\n  (SELECT\n    i_product_name product_name,\n    i_item_sk item_sk,\n    s_store_name store_name,\n    s_zip store_zip,\n    ad1.ca_street_number b_street_number,\n    ad1.ca_street_name b_streen_name,\n    ad1.ca_city b_city,\n    ad1.ca_zip b_zip,\n    ad2.ca_street_number c_street_number,\n    ad2.ca_street_name c_street_name,\n    ad2.ca_city c_city,\n    ad2.ca_zip c_zip,\n    d1.d_year AS syear,\n    d2.d_year AS fsyear,\n    d3.d_year s2year,\n    count(*) cnt,\n    sum(ss_wholesale_cost) s1,\n    sum(ss_list_price) s2,\n    sum(ss_coupon_amt) s3\n  FROM store_sales, store_returns, cs_ui, date_dim d1, date_dim d2, date_dim d3,\n    store, customer, customer_demographics cd1, customer_demographics cd2,\n    promotion, household_demographics hd1, household_demographics hd2,\n    customer_address ad1, customer_address ad2, income_band ib1, income_band ib2, item\n  WHERE ss_store_sk = s_store_sk AND\n    ss_sold_date_sk = d1.d_date_sk AND\n    ss_customer_sk = c_customer_sk AND\n    ss_cdemo_sk = cd1.cd_demo_sk AND\n    ss_hdemo_sk = hd1.hd_demo_sk AND\n    ss_addr_sk = ad1.ca_address_sk AND\n    ss_item_sk = i_item_sk AND\n    ss_item_sk = sr_item_sk AND\n    ss_ticket_number = sr_ticket_number AND\n    ss_item_sk = cs_ui.cs_item_sk AND\n    c_current_cdemo_sk = cd2.cd_demo_sk AND\n    c_current_hdemo_sk = hd2.hd_demo_sk AND\n    c_current_addr_sk = ad2.ca_address_sk AND\n    c_first_sales_date_sk = d2.d_date_sk AND\n    c_first_shipto_date_sk = d3.d_date_sk AND\n    ss_promo_sk = p_promo_sk AND\n    hd1.hd_income_band_sk = 
ib1.ib_income_band_sk AND\n    hd2.hd_income_band_sk = ib2.ib_income_band_sk AND\n    cd1.cd_marital_status <> cd2.cd_marital_status AND\n    i_color IN ('purple', 'burlywood', 'indian', 'spring', 'floral', 'medium') AND\n    i_current_price BETWEEN 64 AND 64 + 10 AND\n    i_current_price BETWEEN 64 + 1 AND 64 + 15\n  GROUP BY i_product_name, i_item_sk, s_store_name, s_zip, ad1.ca_street_number,\n    ad1.ca_street_name, ad1.ca_city, ad1.ca_zip, ad2.ca_street_number,\n    ad2.ca_street_name, ad2.ca_city, ad2.ca_zip, d1.d_year, d2.d_year, d3.d_year\n  )\nSELECT\n  cs1.product_name,\n  cs1.store_name,\n  cs1.store_zip,\n  cs1.b_street_number,\n  cs1.b_streen_name,\n  cs1.b_city,\n  cs1.b_zip,\n  cs1.c_street_number,\n  cs1.c_street_name,\n  cs1.c_city,\n  cs1.c_zip,\n  cs1.syear,\n  cs1.cnt,\n  cs1.s1,\n  cs1.s2,\n  cs1.s3,\n  cs2.s1,\n  cs2.s2,\n  cs2.s3,\n  cs2.syear,\n  cs2.cnt\nFROM cross_sales cs1, cross_sales cs2\nWHERE cs1.item_sk = cs2.item_sk AND\n  cs1.syear = 1999 AND\n  cs2.syear = 1999 + 1 AND\n  cs2.cnt <= cs1.cnt AND\n  cs1.store_name = cs2.store_name AND\n  cs1.store_zip = cs2.store_zip\nORDER BY cs1.product_name, cs1.store_name, cs2.cnt\n"
  },
  {
    "path": "spark-queries-tpcds/q65.sql",
    "content": "SELECT\n  s_store_name,\n  i_item_desc,\n  sc.revenue,\n  i_current_price,\n  i_wholesale_cost,\n  i_brand\nFROM store, item,\n  (SELECT\n    ss_store_sk,\n    avg(revenue) AS ave\n  FROM\n    (SELECT\n      ss_store_sk,\n      ss_item_sk,\n      sum(ss_sales_price) AS revenue\n    FROM store_sales, date_dim\n    WHERE ss_sold_date_sk = d_date_sk AND d_month_seq BETWEEN 1176 AND 1176 + 11\n    GROUP BY ss_store_sk, ss_item_sk) sa\n  GROUP BY ss_store_sk) sb,\n  (SELECT\n    ss_store_sk,\n    ss_item_sk,\n    sum(ss_sales_price) AS revenue\n  FROM store_sales, date_dim\n  WHERE ss_sold_date_sk = d_date_sk AND d_month_seq BETWEEN 1176 AND 1176 + 11\n  GROUP BY ss_store_sk, ss_item_sk) sc\nWHERE sb.ss_store_sk = sc.ss_store_sk AND\n  sc.revenue <= 0.1 * sb.ave AND\n  s_store_sk = sc.ss_store_sk AND\n  i_item_sk = sc.ss_item_sk\nORDER BY s_store_name, i_item_desc\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q66.sql",
    "content": "SELECT\n  w_warehouse_name,\n  w_warehouse_sq_ft,\n  w_city,\n  w_county,\n  w_state,\n  w_country,\n  ship_carriers,\n  year,\n  sum(jan_sales) AS jan_sales,\n  sum(feb_sales) AS feb_sales,\n  sum(mar_sales) AS mar_sales,\n  sum(apr_sales) AS apr_sales,\n  sum(may_sales) AS may_sales,\n  sum(jun_sales) AS jun_sales,\n  sum(jul_sales) AS jul_sales,\n  sum(aug_sales) AS aug_sales,\n  sum(sep_sales) AS sep_sales,\n  sum(oct_sales) AS oct_sales,\n  sum(nov_sales) AS nov_sales,\n  sum(dec_sales) AS dec_sales,\n  sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot,\n  sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot,\n  sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot,\n  sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot,\n  sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot,\n  sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot,\n  sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot,\n  sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot,\n  sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot,\n  sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot,\n  sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot,\n  sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot,\n  sum(jan_net) AS jan_net,\n  sum(feb_net) AS feb_net,\n  sum(mar_net) AS mar_net,\n  sum(apr_net) AS apr_net,\n  sum(may_net) AS may_net,\n  sum(jun_net) AS jun_net,\n  sum(jul_net) AS jul_net,\n  sum(aug_net) AS aug_net,\n  sum(sep_net) AS sep_net,\n  sum(oct_net) AS oct_net,\n  sum(nov_net) AS nov_net,\n  sum(dec_net) AS dec_net\nFROM (\n       (SELECT\n         w_warehouse_name,\n         w_warehouse_sq_ft,\n         w_city,\n         w_county,\n         w_state,\n         w_country,\n         concat('DHL', ',', 'BARIAN') AS ship_carriers,\n         d_year AS year,\n         sum(CASE WHEN d_moy = 1\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS jan_sales,\n   
      sum(CASE WHEN d_moy = 2\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS feb_sales,\n         sum(CASE WHEN d_moy = 3\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS mar_sales,\n         sum(CASE WHEN d_moy = 4\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS apr_sales,\n         sum(CASE WHEN d_moy = 5\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS may_sales,\n         sum(CASE WHEN d_moy = 6\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS jun_sales,\n         sum(CASE WHEN d_moy = 7\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS jul_sales,\n         sum(CASE WHEN d_moy = 8\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS aug_sales,\n         sum(CASE WHEN d_moy = 9\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS sep_sales,\n         sum(CASE WHEN d_moy = 10\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS oct_sales,\n         sum(CASE WHEN d_moy = 11\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS nov_sales,\n         sum(CASE WHEN d_moy = 12\n           THEN ws_ext_sales_price * ws_quantity\n             ELSE 0 END) AS dec_sales,\n         sum(CASE WHEN d_moy = 1\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS jan_net,\n         sum(CASE WHEN d_moy = 2\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS feb_net,\n         sum(CASE WHEN d_moy = 3\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS mar_net,\n         sum(CASE WHEN d_moy = 4\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS apr_net,\n         sum(CASE WHEN d_moy = 5\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS may_net,\n         sum(CASE WHEN d_moy = 6\n           THEN 
ws_net_paid * ws_quantity\n             ELSE 0 END) AS jun_net,\n         sum(CASE WHEN d_moy = 7\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS jul_net,\n         sum(CASE WHEN d_moy = 8\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS aug_net,\n         sum(CASE WHEN d_moy = 9\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS sep_net,\n         sum(CASE WHEN d_moy = 10\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS oct_net,\n         sum(CASE WHEN d_moy = 11\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS nov_net,\n         sum(CASE WHEN d_moy = 12\n           THEN ws_net_paid * ws_quantity\n             ELSE 0 END) AS dec_net\n       FROM\n         web_sales, warehouse, date_dim, time_dim, ship_mode\n       WHERE\n         ws_warehouse_sk = w_warehouse_sk\n           AND ws_sold_date_sk = d_date_sk\n           AND ws_sold_time_sk = t_time_sk\n           AND ws_ship_mode_sk = sm_ship_mode_sk\n           AND d_year = 2001\n           AND t_time BETWEEN 30838 AND 30838 + 28800\n           AND sm_carrier IN ('DHL', 'BARIAN')\n       GROUP BY\n         w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, d_year)\n       UNION ALL\n       (SELECT\n         w_warehouse_name,\n         w_warehouse_sq_ft,\n         w_city,\n         w_county,\n         w_state,\n         w_country,\n         concat('DHL', ',', 'BARIAN') AS ship_carriers,\n         d_year AS year,\n         sum(CASE WHEN d_moy = 1\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS jan_sales,\n         sum(CASE WHEN d_moy = 2\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS feb_sales,\n         sum(CASE WHEN d_moy = 3\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS mar_sales,\n         sum(CASE WHEN d_moy = 4\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS 
apr_sales,\n         sum(CASE WHEN d_moy = 5\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS may_sales,\n         sum(CASE WHEN d_moy = 6\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS jun_sales,\n         sum(CASE WHEN d_moy = 7\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS jul_sales,\n         sum(CASE WHEN d_moy = 8\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS aug_sales,\n         sum(CASE WHEN d_moy = 9\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS sep_sales,\n         sum(CASE WHEN d_moy = 10\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS oct_sales,\n         sum(CASE WHEN d_moy = 11\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS nov_sales,\n         sum(CASE WHEN d_moy = 12\n           THEN cs_sales_price * cs_quantity\n             ELSE 0 END) AS dec_sales,\n         sum(CASE WHEN d_moy = 1\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS jan_net,\n         sum(CASE WHEN d_moy = 2\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS feb_net,\n         sum(CASE WHEN d_moy = 3\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS mar_net,\n         sum(CASE WHEN d_moy = 4\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS apr_net,\n         sum(CASE WHEN d_moy = 5\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS may_net,\n         sum(CASE WHEN d_moy = 6\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS jun_net,\n         sum(CASE WHEN d_moy = 7\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS jul_net,\n         sum(CASE WHEN d_moy = 8\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS aug_net,\n         sum(CASE WHEN d_moy = 9\n       
    THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS sep_net,\n         sum(CASE WHEN d_moy = 10\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS oct_net,\n         sum(CASE WHEN d_moy = 11\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS nov_net,\n         sum(CASE WHEN d_moy = 12\n           THEN cs_net_paid_inc_tax * cs_quantity\n             ELSE 0 END) AS dec_net\n       FROM\n         catalog_sales, warehouse, date_dim, time_dim, ship_mode\n       WHERE\n         cs_warehouse_sk = w_warehouse_sk\n           AND cs_sold_date_sk = d_date_sk\n           AND cs_sold_time_sk = t_time_sk\n           AND cs_ship_mode_sk = sm_ship_mode_sk\n           AND d_year = 2001\n           AND t_time BETWEEN 30838 AND 30838 + 28800\n           AND sm_carrier IN ('DHL', 'BARIAN')\n       GROUP BY\n         w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, d_year\n       )\n     ) x\nGROUP BY\n  w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country,\n  ship_carriers, year\nORDER BY w_warehouse_name\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q67.sql",
    "content": "SELECT *\nFROM\n  (SELECT\n    i_category,\n    i_class,\n    i_brand,\n    i_product_name,\n    d_year,\n    d_qoy,\n    d_moy,\n    s_store_id,\n    sumsales,\n    rank()\n    OVER (PARTITION BY i_category\n      ORDER BY sumsales DESC) rk\n  FROM\n    (SELECT\n      i_category,\n      i_class,\n      i_brand,\n      i_product_name,\n      d_year,\n      d_qoy,\n      d_moy,\n      s_store_id,\n      sum(coalesce(ss_sales_price * ss_quantity, 0)) sumsales\n    FROM store_sales, date_dim, store, item\n    WHERE ss_sold_date_sk = d_date_sk\n      AND ss_item_sk = i_item_sk\n      AND ss_store_sk = s_store_sk\n      AND d_month_seq BETWEEN 1200 AND 1200 + 11\n    GROUP BY ROLLUP (i_category, i_class, i_brand, i_product_name, d_year, d_qoy,\n      d_moy, s_store_id)) dw1) dw2\nWHERE rk <= 100\nORDER BY\n  i_category, i_class, i_brand, i_product_name, d_year,\n  d_qoy, d_moy, s_store_id, sumsales, rk\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q68.sql",
    "content": "SELECT\n  c_last_name,\n  c_first_name,\n  ca_city,\n  bought_city,\n  ss_ticket_number,\n  extended_price,\n  extended_tax,\n  list_price\nFROM (SELECT\n  ss_ticket_number,\n  ss_customer_sk,\n  ca_city bought_city,\n  sum(ss_ext_sales_price) extended_price,\n  sum(ss_ext_list_price) list_price,\n  sum(ss_ext_tax) extended_tax\nFROM store_sales, date_dim, store, household_demographics, customer_address\nWHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk\n  AND store_sales.ss_store_sk = store.s_store_sk\n  AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk\n  AND store_sales.ss_addr_sk = customer_address.ca_address_sk\n  AND date_dim.d_dom BETWEEN 1 AND 2\n  AND (household_demographics.hd_dep_count = 4 OR\n  household_demographics.hd_vehicle_count = 3)\n  AND date_dim.d_year IN (1999, 1999 + 1, 1999 + 2)\n  AND store.s_city IN ('Midway', 'Fairview')\nGROUP BY ss_ticket_number, ss_customer_sk, ss_addr_sk, ca_city) dn,\n  customer,\n  customer_address current_addr\nWHERE ss_customer_sk = c_customer_sk\n  AND customer.c_current_addr_sk = current_addr.ca_address_sk\n  AND current_addr.ca_city <> bought_city\nORDER BY c_last_name, ss_ticket_number\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q69.sql",
    "content": "SELECT\n  cd_gender,\n  cd_marital_status,\n  cd_education_status,\n  count(*) cnt1,\n  cd_purchase_estimate,\n  count(*) cnt2,\n  cd_credit_rating,\n  count(*) cnt3\nFROM\n  customer c, customer_address ca, customer_demographics\nWHERE\n  c.c_current_addr_sk = ca.ca_address_sk AND\n    ca_state IN ('KY', 'GA', 'NM') AND\n    cd_demo_sk = c.c_current_cdemo_sk AND\n    exists(SELECT *\n           FROM store_sales, date_dim\n           WHERE c.c_customer_sk = ss_customer_sk AND\n             ss_sold_date_sk = d_date_sk AND\n             d_year = 2001 AND\n             d_moy BETWEEN 4 AND 4 + 2) AND\n    (NOT exists(SELECT *\n                FROM web_sales, date_dim\n                WHERE c.c_customer_sk = ws_bill_customer_sk AND\n                  ws_sold_date_sk = d_date_sk AND\n                  d_year = 2001 AND\n                  d_moy BETWEEN 4 AND 4 + 2) AND\n      NOT exists(SELECT *\n                 FROM catalog_sales, date_dim\n                 WHERE c.c_customer_sk = cs_ship_customer_sk AND\n                   cs_sold_date_sk = d_date_sk AND\n                   d_year = 2001 AND\n                   d_moy BETWEEN 4 AND 4 + 2))\nGROUP BY cd_gender, cd_marital_status, cd_education_status,\n  cd_purchase_estimate, cd_credit_rating\nORDER BY cd_gender, cd_marital_status, cd_education_status,\n  cd_purchase_estimate, cd_credit_rating\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q7.sql",
    "content": "SELECT\n  i_item_id,\n  avg(ss_quantity) agg1,\n  avg(ss_list_price) agg2,\n  avg(ss_coupon_amt) agg3,\n  avg(ss_sales_price) agg4\nFROM store_sales, customer_demographics, date_dim, item, promotion\nWHERE ss_sold_date_sk = d_date_sk AND\n  ss_item_sk = i_item_sk AND\n  ss_cdemo_sk = cd_demo_sk AND\n  ss_promo_sk = p_promo_sk AND\n  cd_gender = 'M' AND\n  cd_marital_status = 'S' AND\n  cd_education_status = 'College' AND\n  (p_channel_email = 'N' OR p_channel_event = 'N') AND\n  d_year = 2000\nGROUP BY i_item_id\nORDER BY i_item_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q70.sql",
    "content": "SELECT\n  sum(ss_net_profit) AS total_sum,\n  s_state,\n  s_county,\n  grouping(s_state) + grouping(s_county) AS lochierarchy,\n  rank()\n  OVER (\n    PARTITION BY grouping(s_state) + grouping(s_county),\n      CASE WHEN grouping(s_county) = 0\n        THEN s_state END\n    ORDER BY sum(ss_net_profit) DESC) AS rank_within_parent\nFROM\n  store_sales, date_dim d1, store\nWHERE\n  d1.d_month_seq BETWEEN 1200 AND 1200 + 11\n    AND d1.d_date_sk = ss_sold_date_sk\n    AND s_store_sk = ss_store_sk\n    AND s_state IN\n    (SELECT s_state\n    FROM\n      (SELECT\n        s_state AS s_state,\n        rank()\n        OVER (PARTITION BY s_state\n          ORDER BY sum(ss_net_profit) DESC) AS ranking\n      FROM store_sales, store, date_dim\n      WHERE d_month_seq BETWEEN 1200 AND 1200 + 11\n        AND d_date_sk = ss_sold_date_sk\n        AND s_store_sk = ss_store_sk\n      GROUP BY s_state) tmp1\n    WHERE ranking <= 5)\nGROUP BY ROLLUP (s_state, s_county)\nORDER BY\n  lochierarchy DESC\n  , CASE WHEN lochierarchy = 0\n  THEN s_state END\n  , rank_within_parent\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q71.sql",
    "content": "SELECT\n  i_brand_id brand_id,\n  i_brand brand,\n  t_hour,\n  t_minute,\n  sum(ext_price) ext_price\nFROM item,\n  (SELECT\n     ws_ext_sales_price AS ext_price,\n     ws_sold_date_sk AS sold_date_sk,\n     ws_item_sk AS sold_item_sk,\n     ws_sold_time_sk AS time_sk\n   FROM web_sales, date_dim\n   WHERE d_date_sk = ws_sold_date_sk\n     AND d_moy = 11\n     AND d_year = 1999\n   UNION ALL\n   SELECT\n     cs_ext_sales_price AS ext_price,\n     cs_sold_date_sk AS sold_date_sk,\n     cs_item_sk AS sold_item_sk,\n     cs_sold_time_sk AS time_sk\n   FROM catalog_sales, date_dim\n   WHERE d_date_sk = cs_sold_date_sk\n     AND d_moy = 11\n     AND d_year = 1999\n   UNION ALL\n   SELECT\n     ss_ext_sales_price AS ext_price,\n     ss_sold_date_sk AS sold_date_sk,\n     ss_item_sk AS sold_item_sk,\n     ss_sold_time_sk AS time_sk\n   FROM store_sales, date_dim\n   WHERE d_date_sk = ss_sold_date_sk\n     AND d_moy = 11\n     AND d_year = 1999\n  ) AS tmp, time_dim\nWHERE\n  sold_item_sk = i_item_sk\n    AND i_manager_id = 1\n    AND time_sk = t_time_sk\n    AND (t_meal_time = 'breakfast' OR t_meal_time = 'dinner')\nGROUP BY i_brand, i_brand_id, t_hour, t_minute\nORDER BY ext_price DESC, brand_id\n"
  },
  {
    "path": "spark-queries-tpcds/q72.sql",
    "content": "SELECT\n  i_item_desc,\n  w_warehouse_name,\n  d1.d_week_seq,\n  sum(CASE WHEN p_promo_sk IS NULL\n    THEN 1\n        ELSE 0 END) no_promo,\n  sum(CASE WHEN p_promo_sk IS NOT NULL\n    THEN 1\n        ELSE 0 END) promo,\n  count(*) total_cnt\nFROM catalog_sales\n  JOIN inventory ON (cs_item_sk = inv_item_sk)\n  JOIN warehouse ON (w_warehouse_sk = inv_warehouse_sk)\n  JOIN item ON (i_item_sk = cs_item_sk)\n  JOIN customer_demographics ON (cs_bill_cdemo_sk = cd_demo_sk)\n  JOIN household_demographics ON (cs_bill_hdemo_sk = hd_demo_sk)\n  JOIN date_dim d1 ON (cs_sold_date_sk = d1.d_date_sk)\n  JOIN date_dim d2 ON (inv_date_sk = d2.d_date_sk)\n  JOIN date_dim d3 ON (cs_ship_date_sk = d3.d_date_sk)\n  LEFT OUTER JOIN promotion ON (cs_promo_sk = p_promo_sk)\n  LEFT OUTER JOIN catalog_returns ON (cr_item_sk = cs_item_sk AND cr_order_number = cs_order_number)\nWHERE d1.d_week_seq = d2.d_week_seq\n  AND inv_quantity_on_hand < cs_quantity\n  AND d3.d_date > (cast(d1.d_date AS DATE) + interval 5 days)\n  AND hd_buy_potential = '>10000'\n  AND d1.d_year = 1999\n  AND cd_marital_status = 'D'\nGROUP BY i_item_desc, w_warehouse_name, d1.d_week_seq\nORDER BY total_cnt DESC, i_item_desc, w_warehouse_name, d_week_seq\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q73.sql",
    "content": "SELECT\n  c_last_name,\n  c_first_name,\n  c_salutation,\n  c_preferred_cust_flag,\n  ss_ticket_number,\n  cnt\nFROM\n  (SELECT\n    ss_ticket_number,\n    ss_customer_sk,\n    count(*) cnt\n  FROM store_sales, date_dim, store, household_demographics\n  WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk\n    AND store_sales.ss_store_sk = store.s_store_sk\n    AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND date_dim.d_dom BETWEEN 1 AND 2\n    AND (household_demographics.hd_buy_potential = '>10000' OR\n    household_demographics.hd_buy_potential = 'unknown')\n    AND household_demographics.hd_vehicle_count > 0\n    AND CASE WHEN household_demographics.hd_vehicle_count > 0\n    THEN\n      household_demographics.hd_dep_count / household_demographics.hd_vehicle_count\n        ELSE NULL END > 1\n    AND date_dim.d_year IN (1999, 1999 + 1, 1999 + 2)\n    AND store.s_county IN ('Williamson County', 'Franklin Parish', 'Bronx County', 'Orange County')\n  GROUP BY ss_ticket_number, ss_customer_sk) dj, customer\nWHERE ss_customer_sk = c_customer_sk\n  AND cnt BETWEEN 1 AND 5\nORDER BY cnt DESC\n"
  },
  {
    "path": "spark-queries-tpcds/q74.sql",
    "content": "WITH year_total AS (\n  SELECT\n    c_customer_id customer_id,\n    c_first_name customer_first_name,\n    c_last_name customer_last_name,\n    d_year AS year,\n    sum(ss_net_paid) year_total,\n    's' sale_type\n  FROM\n    customer, store_sales, date_dim\n  WHERE c_customer_sk = ss_customer_sk\n    AND ss_sold_date_sk = d_date_sk\n    AND d_year IN (2001, 2001 + 1)\n  GROUP BY\n    c_customer_id, c_first_name, c_last_name, d_year\n  UNION ALL\n  SELECT\n    c_customer_id customer_id,\n    c_first_name customer_first_name,\n    c_last_name customer_last_name,\n    d_year AS year,\n    sum(ws_net_paid) year_total,\n    'w' sale_type\n  FROM\n    customer, web_sales, date_dim\n  WHERE c_customer_sk = ws_bill_customer_sk\n    AND ws_sold_date_sk = d_date_sk\n    AND d_year IN (2001, 2001 + 1)\n  GROUP BY\n    c_customer_id, c_first_name, c_last_name, d_year)\nSELECT\n  t_s_secyear.customer_id,\n  t_s_secyear.customer_first_name,\n  t_s_secyear.customer_last_name\nFROM\n  year_total t_s_firstyear, year_total t_s_secyear,\n  year_total t_w_firstyear, year_total t_w_secyear\nWHERE t_s_secyear.customer_id = t_s_firstyear.customer_id\n  AND t_s_firstyear.customer_id = t_w_secyear.customer_id\n  AND t_s_firstyear.customer_id = t_w_firstyear.customer_id\n  AND t_s_firstyear.sale_type = 's'\n  AND t_w_firstyear.sale_type = 'w'\n  AND t_s_secyear.sale_type = 's'\n  AND t_w_secyear.sale_type = 'w'\n  AND t_s_firstyear.year = 2001\n  AND t_s_secyear.year = 2001 + 1\n  AND t_w_firstyear.year = 2001\n  AND t_w_secyear.year = 2001 + 1\n  AND t_s_firstyear.year_total > 0\n  AND t_w_firstyear.year_total > 0\n  AND CASE WHEN t_w_firstyear.year_total > 0\n  THEN t_w_secyear.year_total / t_w_firstyear.year_total\n      ELSE NULL END\n  > CASE WHEN t_s_firstyear.year_total > 0\n  THEN t_s_secyear.year_total / t_s_firstyear.year_total\n    ELSE NULL END\nORDER BY 1, 1, 1\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q75.sql",
    "content": "WITH all_sales AS (\n  SELECT\n    d_year,\n    i_brand_id,\n    i_class_id,\n    i_category_id,\n    i_manufact_id,\n    SUM(sales_cnt) AS sales_cnt,\n    SUM(sales_amt) AS sales_amt\n  FROM (\n         SELECT\n           d_year,\n           i_brand_id,\n           i_class_id,\n           i_category_id,\n           i_manufact_id,\n           cs_quantity - COALESCE(cr_return_quantity, 0) AS sales_cnt,\n           cs_ext_sales_price - COALESCE(cr_return_amount, 0.0) AS sales_amt\n         FROM catalog_sales\n           JOIN item ON i_item_sk = cs_item_sk\n           JOIN date_dim ON d_date_sk = cs_sold_date_sk\n           LEFT JOIN catalog_returns ON (cs_order_number = cr_order_number\n             AND cs_item_sk = cr_item_sk)\n         WHERE i_category = 'Books'\n         UNION\n         SELECT\n           d_year,\n           i_brand_id,\n           i_class_id,\n           i_category_id,\n           i_manufact_id,\n           ss_quantity - COALESCE(sr_return_quantity, 0) AS sales_cnt,\n           ss_ext_sales_price - COALESCE(sr_return_amt, 0.0) AS sales_amt\n         FROM store_sales\n           JOIN item ON i_item_sk = ss_item_sk\n           JOIN date_dim ON d_date_sk = ss_sold_date_sk\n           LEFT JOIN store_returns ON (ss_ticket_number = sr_ticket_number\n             AND ss_item_sk = sr_item_sk)\n         WHERE i_category = 'Books'\n         UNION\n         SELECT\n           d_year,\n           i_brand_id,\n           i_class_id,\n           i_category_id,\n           i_manufact_id,\n           ws_quantity - COALESCE(wr_return_quantity, 0) AS sales_cnt,\n           ws_ext_sales_price - COALESCE(wr_return_amt, 0.0) AS sales_amt\n         FROM web_sales\n           JOIN item ON i_item_sk = ws_item_sk\n           JOIN date_dim ON d_date_sk = ws_sold_date_sk\n           LEFT JOIN web_returns ON (ws_order_number = wr_order_number\n             AND ws_item_sk = wr_item_sk)\n         WHERE i_category = 'Books') sales_detail\n  GROUP BY d_year, 
i_brand_id, i_class_id, i_category_id, i_manufact_id)\nSELECT\n  prev_yr.d_year AS prev_year,\n  curr_yr.d_year AS year,\n  curr_yr.i_brand_id,\n  curr_yr.i_class_id,\n  curr_yr.i_category_id,\n  curr_yr.i_manufact_id,\n  prev_yr.sales_cnt AS prev_yr_cnt,\n  curr_yr.sales_cnt AS curr_yr_cnt,\n  curr_yr.sales_cnt - prev_yr.sales_cnt AS sales_cnt_diff,\n  curr_yr.sales_amt - prev_yr.sales_amt AS sales_amt_diff\nFROM all_sales curr_yr, all_sales prev_yr\nWHERE curr_yr.i_brand_id = prev_yr.i_brand_id\n  AND curr_yr.i_class_id = prev_yr.i_class_id\n  AND curr_yr.i_category_id = prev_yr.i_category_id\n  AND curr_yr.i_manufact_id = prev_yr.i_manufact_id\n  AND curr_yr.d_year = 2002\n  AND prev_yr.d_year = 2002 - 1\n  AND CAST(curr_yr.sales_cnt AS DECIMAL(17, 2)) / CAST(prev_yr.sales_cnt AS DECIMAL(17, 2)) < 0.9\nORDER BY sales_cnt_diff\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q76.sql",
    "content": "SELECT\n  channel,\n  col_name,\n  d_year,\n  d_qoy,\n  i_category,\n  COUNT(*) sales_cnt,\n  SUM(ext_sales_price) sales_amt\nFROM (\n       SELECT\n         'store' AS channel,\n         ss_store_sk col_name,\n         d_year,\n         d_qoy,\n         i_category,\n         ss_ext_sales_price ext_sales_price\n       FROM store_sales, item, date_dim\n       WHERE ss_store_sk IS NULL\n         AND ss_sold_date_sk = d_date_sk\n         AND ss_item_sk = i_item_sk\n       UNION ALL\n       SELECT\n         'web' AS channel,\n         ws_ship_customer_sk col_name,\n         d_year,\n         d_qoy,\n         i_category,\n         ws_ext_sales_price ext_sales_price\n       FROM web_sales, item, date_dim\n       WHERE ws_ship_customer_sk IS NULL\n         AND ws_sold_date_sk = d_date_sk\n         AND ws_item_sk = i_item_sk\n       UNION ALL\n       SELECT\n         'catalog' AS channel,\n         cs_ship_addr_sk col_name,\n         d_year,\n         d_qoy,\n         i_category,\n         cs_ext_sales_price ext_sales_price\n       FROM catalog_sales, item, date_dim\n       WHERE cs_ship_addr_sk IS NULL\n         AND cs_sold_date_sk = d_date_sk\n         AND cs_item_sk = i_item_sk) foo\nGROUP BY channel, col_name, d_year, d_qoy, i_category\nORDER BY channel, col_name, d_year, d_qoy, i_category\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q77.sql",
    "content": "WITH ss AS\n(SELECT\n    s_store_sk,\n    sum(ss_ext_sales_price) AS sales,\n    sum(ss_net_profit) AS profit\n  FROM store_sales, date_dim, store\n  WHERE ss_sold_date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-03' AS DATE) AND\n  (cast('2000-08-03' AS DATE) + INTERVAL 30 days)\n    AND ss_store_sk = s_store_sk\n  GROUP BY s_store_sk),\n    sr AS\n  (SELECT\n    s_store_sk,\n    sum(sr_return_amt) AS returns,\n    sum(sr_net_loss) AS profit_loss\n  FROM store_returns, date_dim, store\n  WHERE sr_returned_date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-03' AS DATE) AND\n  (cast('2000-08-03' AS DATE) + INTERVAL 30 days)\n    AND sr_store_sk = s_store_sk\n  GROUP BY s_store_sk),\n    cs AS\n  (SELECT\n    cs_call_center_sk,\n    sum(cs_ext_sales_price) AS sales,\n    sum(cs_net_profit) AS profit\n  FROM catalog_sales, date_dim\n  WHERE cs_sold_date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-03' AS DATE) AND\n  (cast('2000-08-03' AS DATE) + INTERVAL 30 days)\n  GROUP BY cs_call_center_sk),\n    cr AS\n  (SELECT\n    sum(cr_return_amount) AS returns,\n    sum(cr_net_loss) AS profit_loss\n  FROM catalog_returns, date_dim\n  WHERE cr_returned_date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-03' AS DATE) AND\n  (cast('2000-08-03' AS DATE) + INTERVAL 30 days)),\n    ws AS\n  (SELECT\n    wp_web_page_sk,\n    sum(ws_ext_sales_price) AS sales,\n    sum(ws_net_profit) AS profit\n  FROM web_sales, date_dim, web_page\n  WHERE ws_sold_date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-03' AS DATE) AND\n  (cast('2000-08-03' AS DATE) + INTERVAL 30 days)\n    AND ws_web_page_sk = wp_web_page_sk\n  GROUP BY wp_web_page_sk),\n    wr AS\n  (SELECT\n    wp_web_page_sk,\n    sum(wr_return_amt) AS returns,\n    sum(wr_net_loss) AS profit_loss\n  FROM web_returns, date_dim, web_page\n  WHERE wr_returned_date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-03' AS DATE) AND\n  (cast('2000-08-03' AS DATE) + INTERVAL 30 
days)\n    AND wr_web_page_sk = wp_web_page_sk\n  GROUP BY wp_web_page_sk)\nSELECT\n  channel,\n  id,\n  sum(sales) AS sales,\n  sum(returns) AS returns,\n  sum(profit) AS profit\nFROM\n  (SELECT\n     'store channel' AS channel,\n     ss.s_store_sk AS id,\n     sales,\n     coalesce(returns, 0) AS returns,\n     (profit - coalesce(profit_loss, 0)) AS profit\n   FROM ss\n     LEFT JOIN sr\n       ON ss.s_store_sk = sr.s_store_sk\n   UNION ALL\n   SELECT\n     'catalog channel' AS channel,\n     cs_call_center_sk AS id,\n     sales,\n     returns,\n     (profit - profit_loss) AS profit\n   FROM cs, cr\n   UNION ALL\n   SELECT\n     'web channel' AS channel,\n     ws.wp_web_page_sk AS id,\n     sales,\n     coalesce(returns, 0) returns,\n     (profit - coalesce(profit_loss, 0)) AS profit\n   FROM ws\n     LEFT JOIN wr\n       ON ws.wp_web_page_sk = wr.wp_web_page_sk\n  ) x\nGROUP BY ROLLUP (channel, id)\nORDER BY channel, id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q78.sql",
    "content": "WITH ws AS\n(SELECT\n    d_year AS ws_sold_year,\n    ws_item_sk,\n    ws_bill_customer_sk ws_customer_sk,\n    sum(ws_quantity) ws_qty,\n    sum(ws_wholesale_cost) ws_wc,\n    sum(ws_sales_price) ws_sp\n  FROM web_sales\n    LEFT JOIN web_returns ON wr_order_number = ws_order_number AND ws_item_sk = wr_item_sk\n    JOIN date_dim ON ws_sold_date_sk = d_date_sk\n  WHERE wr_order_number IS NULL\n  GROUP BY d_year, ws_item_sk, ws_bill_customer_sk\n),\n    cs AS\n  (SELECT\n    d_year AS cs_sold_year,\n    cs_item_sk,\n    cs_bill_customer_sk cs_customer_sk,\n    sum(cs_quantity) cs_qty,\n    sum(cs_wholesale_cost) cs_wc,\n    sum(cs_sales_price) cs_sp\n  FROM catalog_sales\n    LEFT JOIN catalog_returns ON cr_order_number = cs_order_number AND cs_item_sk = cr_item_sk\n    JOIN date_dim ON cs_sold_date_sk = d_date_sk\n  WHERE cr_order_number IS NULL\n  GROUP BY d_year, cs_item_sk, cs_bill_customer_sk\n  ),\n    ss AS\n  (SELECT\n    d_year AS ss_sold_year,\n    ss_item_sk,\n    ss_customer_sk,\n    sum(ss_quantity) ss_qty,\n    sum(ss_wholesale_cost) ss_wc,\n    sum(ss_sales_price) ss_sp\n  FROM store_sales\n    LEFT JOIN store_returns ON sr_ticket_number = ss_ticket_number AND ss_item_sk = sr_item_sk\n    JOIN date_dim ON ss_sold_date_sk = d_date_sk\n  WHERE sr_ticket_number IS NULL\n  GROUP BY d_year, ss_item_sk, ss_customer_sk\n  )\nSELECT\n  round(ss_qty / (coalesce(ws_qty + cs_qty, 1)), 2) ratio,\n  ss_qty store_qty,\n  ss_wc store_wholesale_cost,\n  ss_sp store_sales_price,\n  coalesce(ws_qty, 0) + coalesce(cs_qty, 0) other_chan_qty,\n  coalesce(ws_wc, 0) + coalesce(cs_wc, 0) other_chan_wholesale_cost,\n  coalesce(ws_sp, 0) + coalesce(cs_sp, 0) other_chan_sales_price\nFROM ss\n  LEFT JOIN ws\n    ON (ws_sold_year = ss_sold_year AND ws_item_sk = ss_item_sk AND ws_customer_sk = ss_customer_sk)\n  LEFT JOIN cs\n    ON (cs_sold_year = ss_sold_year AND cs_item_sk = ss_item_sk AND cs_customer_sk = ss_customer_sk)\nWHERE coalesce(ws_qty, 0) > 0 AND 
coalesce(cs_qty, 0) > 0 AND ss_sold_year = 2000\nORDER BY\n  ratio,\n  ss_qty DESC, ss_wc DESC, ss_sp DESC,\n  other_chan_qty,\n  other_chan_wholesale_cost,\n  other_chan_sales_price,\n  round(ss_qty / (coalesce(ws_qty + cs_qty, 1)), 2)\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q79.sql",
    "content": "SELECT\n  c_last_name,\n  c_first_name,\n  substr(s_city, 1, 30),\n  ss_ticket_number,\n  amt,\n  profit\nFROM\n  (SELECT\n    ss_ticket_number,\n    ss_customer_sk,\n    store.s_city,\n    sum(ss_coupon_amt) amt,\n    sum(ss_net_profit) profit\n  FROM store_sales, date_dim, store, household_demographics\n  WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk\n    AND store_sales.ss_store_sk = store.s_store_sk\n    AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND (household_demographics.hd_dep_count = 6 OR\n    household_demographics.hd_vehicle_count > 2)\n    AND date_dim.d_dow = 1\n    AND date_dim.d_year IN (1999, 1999 + 1, 1999 + 2)\n    AND store.s_number_employees BETWEEN 200 AND 295\n  GROUP BY ss_ticket_number, ss_customer_sk, ss_addr_sk, store.s_city) ms, customer\nWHERE ss_customer_sk = c_customer_sk\nORDER BY c_last_name, c_first_name, substr(s_city, 1, 30), profit\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q8.sql",
    "content": "SELECT\n  s_store_name,\n  sum(ss_net_profit)\nFROM store_sales, date_dim, store,\n  (SELECT ca_zip\n  FROM (\n         (SELECT substr(ca_zip, 1, 5) ca_zip\n         FROM customer_address\n         WHERE substr(ca_zip, 1, 5) IN (\n               '24128','76232','65084','87816','83926','77556','20548',\n               '26231','43848','15126','91137','61265','98294','25782',\n               '17920','18426','98235','40081','84093','28577','55565',\n               '17183','54601','67897','22752','86284','18376','38607',\n               '45200','21756','29741','96765','23932','89360','29839',\n               '25989','28898','91068','72550','10390','18845','47770',\n               '82636','41367','76638','86198','81312','37126','39192',\n               '88424','72175','81426','53672','10445','42666','66864',\n               '66708','41248','48583','82276','18842','78890','49448',\n               '14089','38122','34425','79077','19849','43285','39861',\n               '66162','77610','13695','99543','83444','83041','12305',\n               '57665','68341','25003','57834','62878','49130','81096',\n               '18840','27700','23470','50412','21195','16021','76107',\n               '71954','68309','18119','98359','64544','10336','86379',\n               '27068','39736','98569','28915','24206','56529','57647',\n               '54917','42961','91110','63981','14922','36420','23006',\n               '67467','32754','30903','20260','31671','51798','72325',\n               '85816','68621','13955','36446','41766','68806','16725',\n               '15146','22744','35850','88086','51649','18270','52867',\n               '39972','96976','63792','11376','94898','13595','10516',\n               '90225','58943','39371','94945','28587','96576','57855',\n               '28488','26105','83933','25858','34322','44438','73171',\n               '30122','34102','22685','71256','78451','54364','13354',\n               '45375','40558','56458','28286','45266','47305','69399',\n 
              '83921','26233','11101','15371','69913','35942','15882',\n               '25631','24610','44165','99076','33786','70738','26653',\n               '14328','72305','62496','22152','10144','64147','48425',\n               '14663','21076','18799','30450','63089','81019','68893',\n               '24996','51200','51211','45692','92712','70466','79994',\n               '22437','25280','38935','71791','73134','56571','14060',\n               '19505','72425','56575','74351','68786','51650','20004',\n               '18383','76614','11634','18906','15765','41368','73241',\n               '76698','78567','97189','28545','76231','75691','22246',\n               '51061','90578','56691','68014','51103','94167','57047',\n               '14867','73520','15734','63435','25733','35474','24676',\n               '94627','53535','17879','15559','53268','59166','11928',\n               '59402','33282','45721','43933','68101','33515','36634',\n               '71286','19736','58058','55253','67473','41918','19515',\n               '36495','19430','22351','77191','91393','49156','50298',\n               '87501','18652','53179','18767','63193','23968','65164',\n               '68880','21286','72823','58470','67301','13394','31016',\n               '70372','67030','40604','24317','45748','39127','26065',\n               '77721','31029','31880','60576','24671','45549','13376',\n               '50016','33123','19769','22927','97789','46081','72151',\n               '15723','46136','51949','68100','96888','64528','14171',\n               '79777','28709','11489','25103','32213','78668','22245',\n               '15798','27156','37930','62971','21337','51622','67853',\n               '10567','38415','15455','58263','42029','60279','37125',\n               '56240','88190','50308','26859','64457','89091','82136',\n               '62377','36233','63837','58078','17043','30010','60099',\n               '28810','98025','29178','87343','73273','30469','64034',\n               
'39516','86057','21309','90257','67875','40162','11356',\n               '73650','61810','72013','30431','22461','19512','13375',\n               '55307','30625','83849','68908','26689','96451','38193',\n               '46820','88885','84935','69035','83144','47537','56616',\n               '94983','48033','69952','25486','61547','27385','61860',\n               '58048','56910','16807','17871','35258','31387','35458',\n               '35576'))\n         INTERSECT\n         (SELECT ca_zip\n         FROM\n           (SELECT\n             substr(ca_zip, 1, 5) ca_zip,\n             count(*) cnt\n           FROM customer_address, customer\n           WHERE ca_address_sk = c_current_addr_sk AND\n             c_preferred_cust_flag = 'Y'\n           GROUP BY ca_zip\n           HAVING count(*) > 10) A1)\n       ) A2\n  ) V1\nWHERE ss_store_sk = s_store_sk\n  AND ss_sold_date_sk = d_date_sk\n  AND d_qoy = 2 AND d_year = 1998\n  AND (substr(s_zip, 1, 2) = substr(V1.ca_zip, 1, 2))\nGROUP BY s_store_name\nORDER BY s_store_name\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q80.sql",
    "content": "WITH ssr AS\n(SELECT\n    s_store_id AS store_id,\n    sum(ss_ext_sales_price) AS sales,\n    sum(coalesce(sr_return_amt, 0)) AS returns,\n    sum(ss_net_profit - coalesce(sr_net_loss, 0)) AS profit\n  FROM store_sales\n    LEFT OUTER JOIN store_returns ON\n                                    (ss_item_sk = sr_item_sk AND\n                                      ss_ticket_number = sr_ticket_number)\n    ,\n    date_dim, store, item, promotion\n  WHERE ss_sold_date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-23' AS DATE)\n  AND (cast('2000-08-23' AS DATE) + INTERVAL 30 days)\n    AND ss_store_sk = s_store_sk\n    AND ss_item_sk = i_item_sk\n    AND i_current_price > 50\n    AND ss_promo_sk = p_promo_sk\n    AND p_channel_tv = 'N'\n  GROUP BY s_store_id),\n    csr AS\n  (SELECT\n    cp_catalog_page_id AS catalog_page_id,\n    sum(cs_ext_sales_price) AS sales,\n    sum(coalesce(cr_return_amount, 0)) AS returns,\n    sum(cs_net_profit - coalesce(cr_net_loss, 0)) AS profit\n  FROM catalog_sales\n    LEFT OUTER JOIN catalog_returns ON\n                                      (cs_item_sk = cr_item_sk AND\n                                        cs_order_number = cr_order_number)\n    ,\n    date_dim, catalog_page, item, promotion\n  WHERE cs_sold_date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-23' AS DATE)\n  AND (cast('2000-08-23' AS DATE) + INTERVAL 30 days)\n    AND cs_catalog_page_sk = cp_catalog_page_sk\n    AND cs_item_sk = i_item_sk\n    AND i_current_price > 50\n    AND cs_promo_sk = p_promo_sk\n    AND p_channel_tv = 'N'\n  GROUP BY cp_catalog_page_id),\n    wsr AS\n  (SELECT\n    web_site_id,\n    sum(ws_ext_sales_price) AS sales,\n    sum(coalesce(wr_return_amt, 0)) AS returns,\n    sum(ws_net_profit - coalesce(wr_net_loss, 0)) AS profit\n  FROM web_sales\n    LEFT OUTER JOIN web_returns ON\n                                  (ws_item_sk = wr_item_sk AND ws_order_number = wr_order_number)\n    ,\n    date_dim, web_site, item, 
promotion\n  WHERE ws_sold_date_sk = d_date_sk\n    AND d_date BETWEEN cast('2000-08-23' AS DATE)\n  AND (cast('2000-08-23' AS DATE) + INTERVAL 30 days)\n    AND ws_web_site_sk = web_site_sk\n    AND ws_item_sk = i_item_sk\n    AND i_current_price > 50\n    AND ws_promo_sk = p_promo_sk\n    AND p_channel_tv = 'N'\n  GROUP BY web_site_id)\nSELECT\n  channel,\n  id,\n  sum(sales) AS sales,\n  sum(returns) AS returns,\n  sum(profit) AS profit\nFROM (SELECT\n        'store channel' AS channel,\n        concat('store', store_id) AS id,\n        sales,\n        returns,\n        profit\n      FROM ssr\n      UNION ALL\n      SELECT\n        'catalog channel' AS channel,\n        concat('catalog_page', catalog_page_id) AS id,\n        sales,\n        returns,\n        profit\n      FROM csr\n      UNION ALL\n      SELECT\n        'web channel' AS channel,\n        concat('web_site', web_site_id) AS id,\n        sales,\n        returns,\n        profit\n      FROM wsr) x\nGROUP BY ROLLUP (channel, id)\nORDER BY channel, id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q81.sql",
    "content": "WITH customer_total_return AS\n(SELECT\n    cr_returning_customer_sk AS ctr_customer_sk,\n    ca_state AS ctr_state,\n    sum(cr_return_amt_inc_tax) AS ctr_total_return\n  FROM catalog_returns, date_dim, customer_address\n  WHERE cr_returned_date_sk = d_date_sk\n    AND d_year = 2000\n    AND cr_returning_addr_sk = ca_address_sk\n  GROUP BY cr_returning_customer_sk, ca_state )\nSELECT\n  c_customer_id,\n  c_salutation,\n  c_first_name,\n  c_last_name,\n  ca_street_number,\n  ca_street_name,\n  ca_street_type,\n  ca_suite_number,\n  ca_city,\n  ca_county,\n  ca_state,\n  ca_zip,\n  ca_country,\n  ca_gmt_offset,\n  ca_location_type,\n  ctr_total_return\nFROM customer_total_return ctr1, customer_address, customer\nWHERE ctr1.ctr_total_return > (SELECT avg(ctr_total_return) * 1.2\nFROM customer_total_return ctr2\nWHERE ctr1.ctr_state = ctr2.ctr_state)\n  AND ca_address_sk = c_current_addr_sk\n  AND ca_state = 'GA'\n  AND ctr1.ctr_customer_sk = c_customer_sk\nORDER BY c_customer_id, c_salutation, c_first_name, c_last_name, ca_street_number, ca_street_name\n  , ca_street_type, ca_suite_number, ca_city, ca_county, ca_state, ca_zip, ca_country, ca_gmt_offset\n  , ca_location_type, ctr_total_return\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q82.sql",
    "content": "SELECT\n  i_item_id,\n  i_item_desc,\n  i_current_price\nFROM item, inventory, date_dim, store_sales\nWHERE i_current_price BETWEEN 62 AND 62 + 30\n  AND inv_item_sk = i_item_sk\n  AND d_date_sk = inv_date_sk\n  AND d_date BETWEEN cast('2000-05-25' AS DATE) AND (cast('2000-05-25' AS DATE) + INTERVAL 60 days)\n  AND i_manufact_id IN (129, 270, 821, 423)\n  AND inv_quantity_on_hand BETWEEN 100 AND 500\n  AND ss_item_sk = i_item_sk\nGROUP BY i_item_id, i_item_desc, i_current_price\nORDER BY i_item_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q83.sql",
    "content": "WITH sr_items AS\n(SELECT\n    i_item_id item_id,\n    sum(sr_return_quantity) sr_item_qty\n  FROM store_returns, item, date_dim\n  WHERE sr_item_sk = i_item_sk\n    AND d_date IN (SELECT d_date\n  FROM date_dim\n  WHERE d_week_seq IN\n    (SELECT d_week_seq\n    FROM date_dim\n    WHERE d_date IN ('2000-06-30', '2000-09-27', '2000-11-17')))\n    AND sr_returned_date_sk = d_date_sk\n  GROUP BY i_item_id),\n    cr_items AS\n  (SELECT\n    i_item_id item_id,\n    sum(cr_return_quantity) cr_item_qty\n  FROM catalog_returns, item, date_dim\n  WHERE cr_item_sk = i_item_sk\n    AND d_date IN (SELECT d_date\n  FROM date_dim\n  WHERE d_week_seq IN\n    (SELECT d_week_seq\n    FROM date_dim\n    WHERE d_date IN ('2000-06-30', '2000-09-27', '2000-11-17')))\n    AND cr_returned_date_sk = d_date_sk\n  GROUP BY i_item_id),\n    wr_items AS\n  (SELECT\n    i_item_id item_id,\n    sum(wr_return_quantity) wr_item_qty\n  FROM web_returns, item, date_dim\n  WHERE wr_item_sk = i_item_sk AND d_date IN\n    (SELECT d_date\n    FROM date_dim\n    WHERE d_week_seq IN\n      (SELECT d_week_seq\n      FROM date_dim\n      WHERE d_date IN ('2000-06-30', '2000-09-27', '2000-11-17')))\n    AND wr_returned_date_sk = d_date_sk\n  GROUP BY i_item_id)\nSELECT\n  sr_items.item_id,\n  sr_item_qty,\n  sr_item_qty / (sr_item_qty + cr_item_qty + wr_item_qty) / 3.0 * 100 sr_dev,\n  cr_item_qty,\n  cr_item_qty / (sr_item_qty + cr_item_qty + wr_item_qty) / 3.0 * 100 cr_dev,\n  wr_item_qty,\n  wr_item_qty / (sr_item_qty + cr_item_qty + wr_item_qty) / 3.0 * 100 wr_dev,\n  (sr_item_qty + cr_item_qty + wr_item_qty) / 3.0 average\nFROM sr_items, cr_items, wr_items\nWHERE sr_items.item_id = cr_items.item_id\n  AND sr_items.item_id = wr_items.item_id\nORDER BY sr_items.item_id, sr_item_qty\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q84.sql",
    "content": "SELECT\n  c_customer_id AS customer_id,\n  concat(c_last_name, ', ', c_first_name) AS customername\nFROM customer\n  , customer_address\n  , customer_demographics\n  , household_demographics\n  , income_band\n  , store_returns\nWHERE ca_city = 'Edgewood'\n  AND c_current_addr_sk = ca_address_sk\n  AND ib_lower_bound >= 38128\n  AND ib_upper_bound <= 38128 + 50000\n  AND ib_income_band_sk = hd_income_band_sk\n  AND cd_demo_sk = c_current_cdemo_sk\n  AND hd_demo_sk = c_current_hdemo_sk\n  AND sr_cdemo_sk = cd_demo_sk\nORDER BY c_customer_id\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q85.sql",
    "content": "SELECT\n  substr(r_reason_desc, 1, 20),\n  avg(ws_quantity),\n  avg(wr_refunded_cash),\n  avg(wr_fee)\nFROM web_sales, web_returns, web_page, customer_demographics cd1,\n  customer_demographics cd2, customer_address, date_dim, reason\nWHERE ws_web_page_sk = wp_web_page_sk\n  AND ws_item_sk = wr_item_sk\n  AND ws_order_number = wr_order_number\n  AND ws_sold_date_sk = d_date_sk AND d_year = 2000\n  AND cd1.cd_demo_sk = wr_refunded_cdemo_sk\n  AND cd2.cd_demo_sk = wr_returning_cdemo_sk\n  AND ca_address_sk = wr_refunded_addr_sk\n  AND r_reason_sk = wr_reason_sk\n  AND\n  (\n    (\n      cd1.cd_marital_status = 'M'\n        AND\n        cd1.cd_marital_status = cd2.cd_marital_status\n        AND\n        cd1.cd_education_status = 'Advanced Degree'\n        AND\n        cd1.cd_education_status = cd2.cd_education_status\n        AND\n        ws_sales_price BETWEEN 100.00 AND 150.00\n    )\n      OR\n      (\n        cd1.cd_marital_status = 'S'\n          AND\n          cd1.cd_marital_status = cd2.cd_marital_status\n          AND\n          cd1.cd_education_status = 'College'\n          AND\n          cd1.cd_education_status = cd2.cd_education_status\n          AND\n          ws_sales_price BETWEEN 50.00 AND 100.00\n      )\n      OR\n      (\n        cd1.cd_marital_status = 'W'\n          AND\n          cd1.cd_marital_status = cd2.cd_marital_status\n          AND\n          cd1.cd_education_status = '2 yr Degree'\n          AND\n          cd1.cd_education_status = cd2.cd_education_status\n          AND\n          ws_sales_price BETWEEN 150.00 AND 200.00\n      )\n  )\n  AND\n  (\n    (\n      ca_country = 'United States'\n        AND\n        ca_state IN ('IN', 'OH', 'NJ')\n        AND ws_net_profit BETWEEN 100 AND 200\n    )\n      OR\n      (\n        ca_country = 'United States'\n          AND\n          ca_state IN ('WI', 'CT', 'KY')\n          AND ws_net_profit BETWEEN 150 AND 300\n      )\n      OR\n      (\n        ca_country = 'United States'\n    
      AND\n          ca_state IN ('LA', 'IA', 'AR')\n          AND ws_net_profit BETWEEN 50 AND 250\n      )\n  )\nGROUP BY r_reason_desc\nORDER BY substr(r_reason_desc, 1, 20)\n  , avg(ws_quantity)\n  , avg(wr_refunded_cash)\n  , avg(wr_fee)\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q86.sql",
    "content": "SELECT\n  sum(ws_net_paid) AS total_sum,\n  i_category,\n  i_class,\n  grouping(i_category) + grouping(i_class) AS lochierarchy,\n  rank()\n  OVER (\n    PARTITION BY grouping(i_category) + grouping(i_class),\n      CASE WHEN grouping(i_class) = 0\n        THEN i_category END\n    ORDER BY sum(ws_net_paid) DESC) AS rank_within_parent\nFROM\n  web_sales, date_dim d1, item\nWHERE\n  d1.d_month_seq BETWEEN 1200 AND 1200 + 11\n    AND d1.d_date_sk = ws_sold_date_sk\n    AND i_item_sk = ws_item_sk\nGROUP BY ROLLUP (i_category, i_class)\nORDER BY\n  lochierarchy DESC,\n  CASE WHEN lochierarchy = 0\n    THEN i_category END,\n  rank_within_parent\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q87.sql",
    "content": "SELECT count(*)\nFROM ((SELECT DISTINCT\n  c_last_name,\n  c_first_name,\n  d_date\nFROM store_sales, date_dim, customer\nWHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk\n  AND store_sales.ss_customer_sk = customer.c_customer_sk\n  AND d_month_seq BETWEEN 1200 AND 1200 + 11)\n      EXCEPT\n      (SELECT DISTINCT\n        c_last_name,\n        c_first_name,\n        d_date\n      FROM catalog_sales, date_dim, customer\n      WHERE catalog_sales.cs_sold_date_sk = date_dim.d_date_sk\n        AND catalog_sales.cs_bill_customer_sk = customer.c_customer_sk\n        AND d_month_seq BETWEEN 1200 AND 1200 + 11)\n      EXCEPT\n      (SELECT DISTINCT\n        c_last_name,\n        c_first_name,\n        d_date\n      FROM web_sales, date_dim, customer\n      WHERE web_sales.ws_sold_date_sk = date_dim.d_date_sk\n        AND web_sales.ws_bill_customer_sk = customer.c_customer_sk\n        AND d_month_seq BETWEEN 1200 AND 1200 + 11)\n     ) cool_cust\n"
  },
  {
    "path": "spark-queries-tpcds/q88.sql",
    "content": "SELECT *\nFROM\n  (SELECT count(*) h8_30_to_9\n  FROM store_sales, household_demographics, time_dim, store\n  WHERE ss_sold_time_sk = time_dim.t_time_sk\n    AND ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND ss_store_sk = s_store_sk\n    AND time_dim.t_hour = 8\n    AND time_dim.t_minute >= 30\n    AND (\n    (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2)\n      OR\n      (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2)\n      OR\n      (household_demographics.hd_dep_count = 0 AND\n        household_demographics.hd_vehicle_count <= 0 + 2))\n    AND store.s_store_name = 'ese') s1,\n  (SELECT count(*) h9_to_9_30\n  FROM store_sales, household_demographics, time_dim, store\n  WHERE ss_sold_time_sk = time_dim.t_time_sk\n    AND ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND ss_store_sk = s_store_sk\n    AND time_dim.t_hour = 9\n    AND time_dim.t_minute < 30\n    AND (\n    (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2)\n      OR\n      (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2)\n      OR\n      (household_demographics.hd_dep_count = 0 AND\n        household_demographics.hd_vehicle_count <= 0 + 2))\n    AND store.s_store_name = 'ese') s2,\n  (SELECT count(*) h9_30_to_10\n  FROM store_sales, household_demographics, time_dim, store\n  WHERE ss_sold_time_sk = time_dim.t_time_sk\n    AND ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND ss_store_sk = s_store_sk\n    AND time_dim.t_hour = 9\n    AND time_dim.t_minute >= 30\n    AND (\n    (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2)\n      OR\n      (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2)\n      OR\n      (household_demographics.hd_dep_count = 0 AND\n        
household_demographics.hd_vehicle_count <= 0 + 2))\n    AND store.s_store_name = 'ese') s3,\n  (SELECT count(*) h10_to_10_30\n  FROM store_sales, household_demographics, time_dim, store\n  WHERE ss_sold_time_sk = time_dim.t_time_sk\n    AND ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND ss_store_sk = s_store_sk\n    AND time_dim.t_hour = 10\n    AND time_dim.t_minute < 30\n    AND (\n    (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2)\n      OR\n      (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2)\n      OR\n      (household_demographics.hd_dep_count = 0 AND\n        household_demographics.hd_vehicle_count <= 0 + 2))\n    AND store.s_store_name = 'ese') s4,\n  (SELECT count(*) h10_30_to_11\n  FROM store_sales, household_demographics, time_dim, store\n  WHERE ss_sold_time_sk = time_dim.t_time_sk\n    AND ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND ss_store_sk = s_store_sk\n    AND time_dim.t_hour = 10\n    AND time_dim.t_minute >= 30\n    AND (\n    (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2)\n      OR\n      (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2)\n      OR\n      (household_demographics.hd_dep_count = 0 AND\n        household_demographics.hd_vehicle_count <= 0 + 2))\n    AND store.s_store_name = 'ese') s5,\n  (SELECT count(*) h11_to_11_30\n  FROM store_sales, household_demographics, time_dim, store\n  WHERE ss_sold_time_sk = time_dim.t_time_sk\n    AND ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND ss_store_sk = s_store_sk\n    AND time_dim.t_hour = 11\n    AND time_dim.t_minute < 30\n    AND (\n    (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2)\n      OR\n      (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2)\n      OR\n      
(household_demographics.hd_dep_count = 0 AND\n        household_demographics.hd_vehicle_count <= 0 + 2))\n    AND store.s_store_name = 'ese') s6,\n  (SELECT count(*) h11_30_to_12\n  FROM store_sales, household_demographics, time_dim, store\n  WHERE ss_sold_time_sk = time_dim.t_time_sk\n    AND ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND ss_store_sk = s_store_sk\n    AND time_dim.t_hour = 11\n    AND time_dim.t_minute >= 30\n    AND (\n    (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2)\n      OR\n      (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2)\n      OR\n      (household_demographics.hd_dep_count = 0 AND\n        household_demographics.hd_vehicle_count <= 0 + 2))\n    AND store.s_store_name = 'ese') s7,\n  (SELECT count(*) h12_to_12_30\n  FROM store_sales, household_demographics, time_dim, store\n  WHERE ss_sold_time_sk = time_dim.t_time_sk\n    AND ss_hdemo_sk = household_demographics.hd_demo_sk\n    AND ss_store_sk = s_store_sk\n    AND time_dim.t_hour = 12\n    AND time_dim.t_minute < 30\n    AND (\n    (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2)\n      OR\n      (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2)\n      OR\n      (household_demographics.hd_dep_count = 0 AND\n        household_demographics.hd_vehicle_count <= 0 + 2))\n    AND store.s_store_name = 'ese') s8\n"
  },
  {
    "path": "spark-queries-tpcds/q89.sql",
    "content": "SELECT *\nFROM (\n       SELECT\n         i_category,\n         i_class,\n         i_brand,\n         s_store_name,\n         s_company_name,\n         d_moy,\n         sum(ss_sales_price) sum_sales,\n         avg(sum(ss_sales_price))\n         OVER\n         (PARTITION BY i_category, i_brand, s_store_name, s_company_name)\n         avg_monthly_sales\n       FROM item, store_sales, date_dim, store\n       WHERE ss_item_sk = i_item_sk AND\n         ss_sold_date_sk = d_date_sk AND\n         ss_store_sk = s_store_sk AND\n         d_year IN (1999) AND\n         ((i_category IN ('Books', 'Electronics', 'Sports') AND\n           i_class IN ('computers', 'stereo', 'football'))\n           OR (i_category IN ('Men', 'Jewelry', 'Women') AND\n           i_class IN ('shirts', 'birdal', 'dresses')))\n       GROUP BY i_category, i_class, i_brand,\n         s_store_name, s_company_name, d_moy) tmp1\nWHERE CASE WHEN (avg_monthly_sales <> 0)\n  THEN (abs(sum_sales - avg_monthly_sales) / avg_monthly_sales)\n      ELSE NULL END > 0.1\nORDER BY sum_sales - avg_monthly_sales, s_store_name\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q9.sql",
    "content": "SELECT\n  CASE WHEN (SELECT count(*)\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 1 AND 20) > 62316685\n    THEN (SELECT avg(ss_ext_discount_amt)\n    FROM store_sales\n    WHERE ss_quantity BETWEEN 1 AND 20)\n  ELSE (SELECT avg(ss_net_paid)\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 1 AND 20) END bucket1,\n  CASE WHEN (SELECT count(*)\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 21 AND 40) > 19045798\n    THEN (SELECT avg(ss_ext_discount_amt)\n    FROM store_sales\n    WHERE ss_quantity BETWEEN 21 AND 40)\n  ELSE (SELECT avg(ss_net_paid)\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 21 AND 40) END bucket2,\n  CASE WHEN (SELECT count(*)\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 41 AND 60) > 365541424\n    THEN (SELECT avg(ss_ext_discount_amt)\n    FROM store_sales\n    WHERE ss_quantity BETWEEN 41 AND 60)\n  ELSE (SELECT avg(ss_net_paid)\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 41 AND 60) END bucket3,\n  CASE WHEN (SELECT count(*)\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 61 AND 80) > 216357808\n    THEN (SELECT avg(ss_ext_discount_amt)\n    FROM store_sales\n    WHERE ss_quantity BETWEEN 61 AND 80)\n  ELSE (SELECT avg(ss_net_paid)\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 61 AND 80) END bucket4,\n  CASE WHEN (SELECT count(*)\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 81 AND 100) > 184483884\n    THEN (SELECT avg(ss_ext_discount_amt)\n    FROM store_sales\n    WHERE ss_quantity BETWEEN 81 AND 100)\n  ELSE (SELECT avg(ss_net_paid)\n  FROM store_sales\n  WHERE ss_quantity BETWEEN 81 AND 100) END bucket5\nFROM reason\nWHERE r_reason_sk = 1\n"
  },
  {
    "path": "spark-queries-tpcds/q90.sql",
    "content": "SELECT cast(amc AS DECIMAL(15, 4)) / cast(pmc AS DECIMAL(15, 4)) am_pm_ratio\nFROM (SELECT count(*) amc\nFROM web_sales, household_demographics, time_dim, web_page\nWHERE ws_sold_time_sk = time_dim.t_time_sk\n  AND ws_ship_hdemo_sk = household_demographics.hd_demo_sk\n  AND ws_web_page_sk = web_page.wp_web_page_sk\n  AND time_dim.t_hour BETWEEN 8 AND 8 + 1\n  AND household_demographics.hd_dep_count = 6\n  AND web_page.wp_char_count BETWEEN 5000 AND 5200) at,\n  (SELECT count(*) pmc\n  FROM web_sales, household_demographics, time_dim, web_page\n  WHERE ws_sold_time_sk = time_dim.t_time_sk\n    AND ws_ship_hdemo_sk = household_demographics.hd_demo_sk\n    AND ws_web_page_sk = web_page.wp_web_page_sk\n    AND time_dim.t_hour BETWEEN 19 AND 19 + 1\n    AND household_demographics.hd_dep_count = 6\n    AND web_page.wp_char_count BETWEEN 5000 AND 5200) pt\nORDER BY am_pm_ratio\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q91.sql",
    "content": "SELECT\n  cc_call_center_id Call_Center,\n  cc_name Call_Center_Name,\n  cc_manager Manager,\n  sum(cr_net_loss) Returns_Loss\nFROM\n  call_center, catalog_returns, date_dim, customer, customer_address,\n  customer_demographics, household_demographics\nWHERE\n  cr_call_center_sk = cc_call_center_sk\n    AND cr_returned_date_sk = d_date_sk\n    AND cr_returning_customer_sk = c_customer_sk\n    AND cd_demo_sk = c_current_cdemo_sk\n    AND hd_demo_sk = c_current_hdemo_sk\n    AND ca_address_sk = c_current_addr_sk\n    AND d_year = 1998\n    AND d_moy = 11\n    AND ((cd_marital_status = 'M' AND cd_education_status = 'Unknown')\n    OR (cd_marital_status = 'W' AND cd_education_status = 'Advanced Degree'))\n    AND hd_buy_potential LIKE 'Unknown%'\n    AND ca_gmt_offset = -7\nGROUP BY cc_call_center_id, cc_name, cc_manager, cd_marital_status, cd_education_status\nORDER BY sum(cr_net_loss) DESC\n"
  },
  {
    "path": "spark-queries-tpcds/q92.sql",
    "content": "SELECT sum(ws_ext_discount_amt) AS `Excess Discount Amount `\nFROM web_sales, item, date_dim\nWHERE i_manufact_id = 350\n  AND i_item_sk = ws_item_sk\n  AND d_date BETWEEN '2000-01-27' AND (cast('2000-01-27' AS DATE) + INTERVAL 90 days)\n  AND d_date_sk = ws_sold_date_sk\n  AND ws_ext_discount_amt >\n  (\n    SELECT 1.3 * avg(ws_ext_discount_amt)\n    FROM web_sales, date_dim\n    WHERE ws_item_sk = i_item_sk\n      AND d_date BETWEEN '2000-01-27' AND (cast('2000-01-27' AS DATE) + INTERVAL 90 days)\n      AND d_date_sk = ws_sold_date_sk\n  )\nORDER BY sum(ws_ext_discount_amt)\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q93.sql",
    "content": "SELECT\n  ss_customer_sk,\n  sum(act_sales) sumsales\nFROM (SELECT\n  ss_item_sk,\n  ss_ticket_number,\n  ss_customer_sk,\n  CASE WHEN sr_return_quantity IS NOT NULL\n    THEN (ss_quantity - sr_return_quantity) * ss_sales_price\n  ELSE (ss_quantity * ss_sales_price) END act_sales\nFROM store_sales\n  LEFT OUTER JOIN store_returns\n    ON (sr_item_sk = ss_item_sk AND sr_ticket_number = ss_ticket_number)\n  ,\n  reason\nWHERE sr_reason_sk = r_reason_sk AND r_reason_desc = 'reason 28') t\nGROUP BY ss_customer_sk\nORDER BY sumsales, ss_customer_sk\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q94.sql",
    "content": "SELECT\n  count(DISTINCT ws_order_number) AS `order count `,\n  sum(ws_ext_ship_cost) AS `total shipping cost `,\n  sum(ws_net_profit) AS `total net profit `\nFROM\n  web_sales ws1, date_dim, customer_address, web_site\nWHERE\n  d_date BETWEEN '1999-02-01' AND\n  (CAST('1999-02-01' AS DATE) + INTERVAL 60 days)\n    AND ws1.ws_ship_date_sk = d_date_sk\n    AND ws1.ws_ship_addr_sk = ca_address_sk\n    AND ca_state = 'IL'\n    AND ws1.ws_web_site_sk = web_site_sk\n    AND web_company_name = 'pri'\n    AND EXISTS(SELECT *\n               FROM web_sales ws2\n               WHERE ws1.ws_order_number = ws2.ws_order_number\n                 AND ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)\n    AND NOT EXISTS(SELECT *\n                   FROM web_returns wr1\n                   WHERE ws1.ws_order_number = wr1.wr_order_number)\nORDER BY count(DISTINCT ws_order_number)\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q95.sql",
    "content": "WITH ws_wh AS\n(SELECT\n    ws1.ws_order_number,\n    ws1.ws_warehouse_sk wh1,\n    ws2.ws_warehouse_sk wh2\n  FROM web_sales ws1, web_sales ws2\n  WHERE ws1.ws_order_number = ws2.ws_order_number\n    AND ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)\nSELECT\n  count(DISTINCT ws_order_number) AS `order count `,\n  sum(ws_ext_ship_cost) AS `total shipping cost `,\n  sum(ws_net_profit) AS `total net profit `\nFROM\n  web_sales ws1, date_dim, customer_address, web_site\nWHERE\n  d_date BETWEEN '1999-02-01' AND\n  (CAST('1999-02-01' AS DATE) + INTERVAL 60 DAY)\n    AND ws1.ws_ship_date_sk = d_date_sk\n    AND ws1.ws_ship_addr_sk = ca_address_sk\n    AND ca_state = 'IL'\n    AND ws1.ws_web_site_sk = web_site_sk\n    AND web_company_name = 'pri'\n    AND ws1.ws_order_number IN (SELECT ws_order_number\n  FROM ws_wh)\n    AND ws1.ws_order_number IN (SELECT wr_order_number\n  FROM web_returns, ws_wh\n  WHERE wr_order_number = ws_wh.ws_order_number)\nORDER BY count(DISTINCT ws_order_number)\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q96.sql",
    "content": "SELECT count(*)\nFROM store_sales, household_demographics, time_dim, store\nWHERE ss_sold_time_sk = time_dim.t_time_sk\n  AND ss_hdemo_sk = household_demographics.hd_demo_sk\n  AND ss_store_sk = s_store_sk\n  AND time_dim.t_hour = 20\n  AND time_dim.t_minute >= 30\n  AND household_demographics.hd_dep_count = 7\n  AND store.s_store_name = 'ese'\nORDER BY count(*)\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q97.sql",
    "content": "WITH ssci AS (\n  SELECT\n    ss_customer_sk customer_sk,\n    ss_item_sk item_sk\n  FROM store_sales, date_dim\n  WHERE ss_sold_date_sk = d_date_sk\n    AND d_month_seq BETWEEN 1200 AND 1200 + 11\n  GROUP BY ss_customer_sk, ss_item_sk),\n    csci AS (\n    SELECT\n      cs_bill_customer_sk customer_sk,\n      cs_item_sk item_sk\n    FROM catalog_sales, date_dim\n    WHERE cs_sold_date_sk = d_date_sk\n      AND d_month_seq BETWEEN 1200 AND 1200 + 11\n    GROUP BY cs_bill_customer_sk, cs_item_sk)\nSELECT\n  sum(CASE WHEN ssci.customer_sk IS NOT NULL AND csci.customer_sk IS NULL\n    THEN 1\n      ELSE 0 END) store_only,\n  sum(CASE WHEN ssci.customer_sk IS NULL AND csci.customer_sk IS NOT NULL\n    THEN 1\n      ELSE 0 END) catalog_only,\n  sum(CASE WHEN ssci.customer_sk IS NOT NULL AND csci.customer_sk IS NOT NULL\n    THEN 1\n      ELSE 0 END) store_and_catalog\nFROM ssci\n  FULL OUTER JOIN csci ON (ssci.customer_sk = csci.customer_sk\n    AND ssci.item_sk = csci.item_sk)\nLIMIT 100\n"
  },
  {
    "path": "spark-queries-tpcds/q98.sql",
    "content": "SELECT\n  i_item_desc,\n  i_category,\n  i_class,\n  i_current_price,\n  sum(ss_ext_sales_price) AS itemrevenue,\n  sum(ss_ext_sales_price) * 100 / sum(sum(ss_ext_sales_price))\n  OVER\n  (PARTITION BY i_class) AS revenueratio\nFROM\n  store_sales, item, date_dim\nWHERE\n  ss_item_sk = i_item_sk\n    AND i_category IN ('Sports', 'Books', 'Home')\n    AND ss_sold_date_sk = d_date_sk\n    AND d_date BETWEEN cast('1999-02-22' AS DATE)\n  AND (cast('1999-02-22' AS DATE) + INTERVAL 30 days)\nGROUP BY\n  i_item_id, i_item_desc, i_category, i_class, i_current_price\nORDER BY\n  i_category, i_class, i_item_id, i_item_desc, revenueratio\n"
  },
  {
    "path": "spark-queries-tpcds/q99.sql",
    "content": "SELECT\n  substr(w_warehouse_name, 1, 20),\n  sm_type,\n  cc_name,\n  sum(CASE WHEN (cs_ship_date_sk - cs_sold_date_sk <= 30)\n    THEN 1\n      ELSE 0 END)  AS `30 days `,\n  sum(CASE WHEN (cs_ship_date_sk - cs_sold_date_sk > 30) AND\n    (cs_ship_date_sk - cs_sold_date_sk <= 60)\n    THEN 1\n      ELSE 0 END)  AS `31 - 60 days `,\n  sum(CASE WHEN (cs_ship_date_sk - cs_sold_date_sk > 60) AND\n    (cs_ship_date_sk - cs_sold_date_sk <= 90)\n    THEN 1\n      ELSE 0 END)  AS `61 - 90 days `,\n  sum(CASE WHEN (cs_ship_date_sk - cs_sold_date_sk > 90) AND\n    (cs_ship_date_sk - cs_sold_date_sk <= 120)\n    THEN 1\n      ELSE 0 END)  AS `91 - 120 days `,\n  sum(CASE WHEN (cs_ship_date_sk - cs_sold_date_sk > 120)\n    THEN 1\n      ELSE 0 END)  AS `>120 days `\nFROM\n  catalog_sales, warehouse, ship_mode, call_center, date_dim\nWHERE\n  d_month_seq BETWEEN 1200 AND 1200 + 11\n    AND cs_ship_date_sk = d_date_sk\n    AND cs_warehouse_sk = w_warehouse_sk\n    AND cs_ship_mode_sk = sm_ship_mode_sk\n    AND cs_call_center_sk = cc_call_center_sk\nGROUP BY\n  substr(w_warehouse_name, 1, 20), sm_type, cc_name\nORDER BY substr(w_warehouse_name, 1, 20), sm_type, cc_name\nLIMIT 100\n"
  },
  {
    "path": "tpcds-build.sh",
    "content": "#!/bin/sh\n\n# Check for all the stuff I need to function.\nfor f in gcc javac; do\n\twhich $f > /dev/null 2>&1\n\tif [ $? -ne 0 ]; then\n\t\techo \"Required program $f is missing. Please install or fix your path and try again.\"\n\t\texit 1\n\tfi\ndone\n\n# Check if Maven is installed and install it if not.\nwhich mvn > /dev/null 2>&1\nif [ $? -ne 0 ]; then\n\tSKIP=0\n\tif [ -e \"apache-maven-3.0.5-bin.tar.gz\" ]; then\n\t\tSIZE=`du -b apache-maven-3.0.5-bin.tar.gz | cut -f 1`\n\t\tif [ $SIZE -eq 5144659 ]; then\n\t\t\tSKIP=1\n\t\tfi\n\tfi\n\tif [ $SKIP -ne 1 ]; then\n\t\techo \"Maven not found, automatically installing it.\"\n\t\tcurl -O https://downloads.apache.org/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz 2> /dev/null\n\t\tif [ $? -ne 0 ]; then\n\t\t\techo \"Failed to download Maven, check Internet connectivity and try again.\"\n\t\t\texit 1\n\t\tfi\n\tfi\n\ttar -zxf apache-maven-3.0.5-bin.tar.gz > /dev/null\n\tCWD=$(pwd)\n\texport MAVEN_HOME=\"$CWD/apache-maven-3.0.5\"\n\texport PATH=$PATH:$MAVEN_HOME/bin\nfi\n\necho \"Building TPC-DS Data Generator\"\n(cd tpcds-gen; make)\necho \"TPC-DS Data Generator built, you can now use tpcds-setup.sh to generate data.\"\n"
  },
  {
    "path": "tpcds-gen/Makefile",
    "content": "\nall: target/lib/dsdgen.jar target/tpcds-gen-1.0-SNAPSHOT.jar\n\ntarget/tpcds-gen-1.0-SNAPSHOT.jar: $(shell find -name *.java) \n\tmvn package\n\ntarget/tpcds_kit.zip: tpcds_kit.zip\n\tmkdir -p target/\n\tcp tpcds_kit.zip target/tpcds_kit.zip\n\ntpcds_kit.zip:\n\tcurl https://public-repo-1.hortonworks.com/hive-testbench/tpcds/README\n\tcurl --output tpcds_kit.zip https://public-repo-1.hortonworks.com/hive-testbench/tpcds/TPCDS_Tools.zip\n\ntarget/lib/dsdgen.jar: target/tools/dsdgen\n\tcd target/; mkdir -p lib/; ( jar cvf lib/dsdgen.jar tools/ || gjar cvf lib/dsdgen.jar tools/ )\n\ntarget/tools/dsdgen: target/tpcds_kit.zip\n\ttest -d target/tools/ || (cd target; unzip tpcds_kit.zip)\n\ttest -d target/tools/ || (cd target; mv */tools tools)\n\tcd target/tools; cat ../../patches/all/*.patch | patch -p0\n\tcd target/tools; cat ../../patches/${MYOS}/*.patch | patch -p1\n\tcd target/tools; make clean; make dsdgen\n\nclean:\n\tmvn clean\n"
  },
  {
    "path": "tpcds-gen/README.md",
    "content": "Mapreduce TPC-DS Generator\n==========================\n\nThis simplifies creating tpc-ds data-sets on large scales on a hadoop cluster.\n\nTo get set up, you need to run\n\n\t$ make \n\nthis will download the TPC-DS dsgen program, compile it and use maven to build the MR app wrapped around it.\n\nTo generate the data-sets, you need to run (say, for scale = 200, parallelism = 100)\n\n\t$ hadoop  jar target/tpcds-gen-1.0-SNAPSHOT.jar   -d /tmp/store_sales/200/ -p 100 -s 200 \n\nThis uses the existing parallelism in the driver.c of TPC-DS without modification and uses it to run the command on multiple machines instead of running in local fork mode.\n\nThe command generates multiple files for each map task, resulting in each table having its own subdirectory.\n\nAssumptions made are that all machines in the cluster are OS/arch/lib identical.\n"
  },
  {
    "path": "tpcds-gen/patches/Darwin/macosx.patch",
    "content": "diff -rupN tools/Makefile.suite toolsnew/Makefile.suite\n--- tools/Makefile.suite\t2012-04-25 11:03:50.000000000 -0700\n+++ toolsnew/Makefile.suite\t2014-06-25 13:15:00.000000000 -0700\n@@ -38,8 +38,8 @@\n ################\n ## TARGET OS HERE\n ################\n-# OS Values: AIX, LINUX, SOLARIS, NCR, HPUX\n-OS\t=\tLINUX\n+# OS Values: AIX, LINUX, SOLARIS, NCR, HPUX, OSX\n+OS\t=\tOSX\n ###########\n # No changes should be necessary below this point\n # Each compile variable is adjusted for the target platform using the OS setting above\n@@ -47,7 +47,8 @@ OS\t=\tLINUX\n # CC\n AIX_CC\t\t= xlC\n HPUX_CC\t\t= gcc\n-LINUX_CC\t\t= gcc\n+LINUX_CC\t= gcc\n+OSX_CC\t\t= gcc\n NCR_CC\t\t= cc\n SOLARIS_CC\t= gcc\n SOL86_CC\t= cc\n@@ -55,7 +56,8 @@ CC\t\t= $($(OS)_CC)\n # CFLAGS\n AIX_CFLAGS\t\t= -q64 -O3 -D_LARGE_FILES\n HPUX_CFLAGS\t\t= -O3 -Wall\n-LINUX_CFLAGS\t= -g -Wall\n+LINUX_CFLAGS\t\t= -g -Wall\n+OSX_CFLAGS\t\t= -g -Wall\n NCR_CFLAGS\t\t= -g \n SOLARIS_CFLAGS\t= -O3 -Wall\n SOL86_CFLAGS\t= -O3 \n@@ -65,6 +67,7 @@ CFLAGS\t\t\t= $(BASE_CFLAGS) -D$(OS) $($(OS\n AIX_EXE\t= \n HPUX_EXE\t= \n LINUX_EXE\t= \n+OSX_EXE\t\t= \n NCR_EXE\t\t= \n SOLARIS_EXE\t= \n SOL86_EXE\t= \n@@ -73,6 +76,7 @@ EXE\t\t= $($(OS)_EXE)\n AIX_LEX\t\t= flex\n HPUX_LEX\t= flex\n LINUX_LEX\t= lex\n+OSX_LEX\t\t= lex\n NCR_LEX\t\t= lex\n SOLARIS_LEX\t= lex\n SOL86_LEX\t= lex\n@@ -81,6 +85,7 @@ LEX\t\t= $($(OS)_LEX)\n AIX_LIBS\t= -lm\n HPUX_LIBS\t= -lm -ll\n LINUX_LIBS\t= -lm\n+OSX_LIBS\t= -lm\n NCR_LIBS\t= -lm -lc89\n SOLARIS_LIBS\t= -ly -ll -lm\n SOL86_LIBS\t= -ly -ll -lm\n@@ -89,6 +94,7 @@ LIBS\t\t= $($(OS)_LIBS)\n AIX_YACC\t= yacc\n HPUX_YACC\t= bison -y\n LINUX_YACC\t= yacc\n+OSX_YACC\t= yacc\n NCR_YACC\t= yacc\n SOLARIS_YACC\t= yacc\n SOL86_YACC\t= yacc\n@@ -97,6 +103,7 @@ YACC\t\t= $($(OS)_YACC)\n AIX_YFLAGS\t= -d -v\n HPUX_YFLAGS\t= -y -d -v\n LINUX_YFLAGS\t= -d -v\n+OSX_YFLAGS\t= -d -v\n NCR_YFLAGS\t= -d -v\n SOLARIS_YFLAGS\t= -d -v\n SOL86_YFLAGS\t= -d -v\ndiff -rupN 
tools/config.h toolsnew/config.h\n--- tools/config.h\t2012-04-25 11:03:52.000000000 -0700\n+++ toolsnew/config.h\t2014-06-25 13:15:00.000000000 -0700\n@@ -109,6 +109,18 @@\n #define FLEX\n #endif /* LINUX */\n \n+#ifdef OSX\n+#define SUPPORT_64BITS\n+#define HUGE_TYPE       int64_t\n+#define HUGE_FORMAT     \"%lld\"\n+#define HUGE_COUNT      1\n+#define USE_STRING_H\n+#define USE_LIMITS_H\n+#define MAXINT INT_MAX\n+#define USE_STDLIB_H\n+#define FLEX\n+#endif /* OSX */\n+\n #ifdef SOLARIS\n #define SUPPORT_64BITS\n #define HUGE_TYPE\tlong long \ndiff -rupN tools/makefile toolsnew/makefile\n--- tools/makefile\t2012-04-25 11:03:54.000000000 -0700\n+++ toolsnew/makefile\t2014-06-25 13:15:00.000000000 -0700\n@@ -38,8 +38,8 @@\n ################\n ## TARGET OS HERE\n ################\n-# OS Values: AIX, LINUX, SOLARIS, NCR, HPUX\n-OS\t=\tLINUX\n+# OS Values: AIX, LINUX, SOLARIS, NCR, HPUX, OSX\n+OS\t=\tOSX\n ###########\n # No changes should be necessary below this point\n # Each compile variable is adjusted for the target platform using the OS setting above\n@@ -47,7 +47,8 @@ OS\t=\tLINUX\n # CC\n AIX_CC\t\t= xlC\n HPUX_CC\t\t= gcc\n-LINUX_CC\t\t= gcc\n+LINUX_CC\t= gcc\n+OSX_CC\t\t= gcc\n NCR_CC\t\t= cc\n SOLARIS_CC\t= gcc\n SOL86_CC\t= cc\n@@ -56,6 +57,7 @@ CC\t\t= $($(OS)_CC)\n AIX_CFLAGS\t\t= -q64 -O3 -D_LARGE_FILES\n HPUX_CFLAGS\t\t= -O3 -Wall\n LINUX_CFLAGS\t= -g -Wall\n+OSX_CFLAGS\t= -g -Wall -I/usr/include/malloc\n NCR_CFLAGS\t\t= -g \n SOLARIS_CFLAGS\t= -O3 -Wall\n SOL86_CFLAGS\t= -O3 \n@@ -65,6 +67,7 @@ CFLAGS\t\t\t= $(BASE_CFLAGS) -D$(OS) $($(OS\n AIX_EXE\t= \n HPUX_EXE\t= \n LINUX_EXE\t= \n+OSX_EXE\t= \n NCR_EXE\t\t= \n SOLARIS_EXE\t= \n SOL86_EXE\t= \n@@ -73,6 +76,7 @@ EXE\t\t= $($(OS)_EXE)\n AIX_LEX\t\t= flex\n HPUX_LEX\t= flex\n LINUX_LEX\t= lex\n+OSX_LEX\t\t= lex\n NCR_LEX\t\t= lex\n SOLARIS_LEX\t= lex\n SOL86_LEX\t= lex\n@@ -81,6 +85,7 @@ LEX\t\t= $($(OS)_LEX)\n AIX_LIBS\t= -lm\n HPUX_LIBS\t= -lm -ll\n LINUX_LIBS\t= -lm\n+OSX_LIBS\t= -lm\n NCR_LIBS\t= 
-lm -lc89\n SOLARIS_LIBS\t= -ly -ll -lm\n SOL86_LIBS\t= -ly -ll -lm\n@@ -89,6 +94,7 @@ LIBS\t\t= $($(OS)_LIBS)\n AIX_YACC\t= yacc\n HPUX_YACC\t= bison -y\n LINUX_YACC\t= yacc\n+OSX_YACC\t= yacc\n NCR_YACC\t= yacc\n SOLARIS_YACC\t= yacc\n SOL86_YACC\t= yacc\n@@ -97,6 +103,7 @@ YACC\t\t= $($(OS)_YACC)\n AIX_YFLAGS\t= -d -v\n HPUX_YFLAGS\t= -y -d -v\n LINUX_YFLAGS\t= -d -v\n+OSX_YFLAGS\t= -d -v\n NCR_YFLAGS\t= -d -v\n SOLARIS_YFLAGS\t= -d -v\n SOL86_YFLAGS\t= -d -v\n"
  },
  {
    "path": "tpcds-gen/patches/all/tpcds-buffered.patch",
    "content": "diff --git print.c print.c\nindex 1b64362..5108bd7 100644\n--- print.c\n+++ print.c\n@@ -68,6 +68,7 @@ print_close(int tbl)\n \tfpOutfile = NULL;\n \tif (pTdef->outfile)\n \t{\n+\t\tfflush(pTdef->outfile);\n \t\tfclose(pTdef->outfile);\n \t\tpTdef->outfile = NULL;\n \t}\n@@ -536,7 +538,7 @@ print_end (int tbl)\n    if (add_term)\n       fwrite(term, 1, add_term, fpOutfile);\n    fprintf (fpOutfile, \"\\n\");\n-   fflush(fpOutfile);\n+   //fflush(fpOutfile);\n \n    return (res);\n }\n"
  },
  {
    "path": "tpcds-gen/patches/all/tpcds-strcpy.patch",
    "content": "diff --git r_params.c r_params.c\nindex 4db16e5..9b1a8e6 100644\n--- r_params.c\n+++ r_params.c\n@@ -46,7 +46,7 @@\n #include \"tdefs.h\"\n #include \"release.h\"\n \n-#define PARAM_MAX_LEN\t80\n+#define PARAM_MAX_LEN\tPATH_MAX\n \n #ifndef TEST\n extern option_t options[];\n@@ -275,7 +275,7 @@ set_str(char *var, char *val)\n \tnParam = fnd_param(var);\n \tif (nParam >= 0)\n \t{\n-\t\tstrcpy(params[options[nParam].index], val);\n+\t\tstrncpy(params[options[nParam].index], val, PARAM_MAX_LEN);\n \t\toptions[nParam].flags |= OPT_SET;\n \t}\n \n"
  },
  {
    "path": "tpcds-gen/patches/all/tpcds_misspelled_header_guard.patch",
    "content": "--- w_store_sales.h.orig\t2014-06-25 10:58:19.000000000 -0700\n+++ w_store_sales.h\t2014-06-25 10:58:51.000000000 -0700\n@@ -34,7 +34,7 @@\n  * Gradient Systems\n  */ \n #ifndef W_STORE_SALES_H\n-#define W_STORE_SLAES_H\n+#define W_STORE_SALES_H\n \n #include \"constants.h\"\n #include \"pricing.h\"\n"
  },
  {
    "path": "tpcds-gen/pom.xml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n\n<project xmlns=\"http://maven.apache.org/POM/4.0.0\"\n    xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\n    xsi:schemaLocation=\"http://maven.apache.org/POM/4.0.0\n                        http://maven.apache.org/maven-v4_0_0.xsd\">\n\n  <modelVersion>4.0.0</modelVersion>\n\n  <groupId>org.notmysock.tpcds</groupId>\n  <artifactId>tpcds-gen</artifactId>\n  <version>1.0-SNAPSHOT</version>\n  <packaging>jar</packaging>\n\n  <name>tpcds-gen</name>\n  <url>http://maven.apache.org</url>\n\n  <dependencies>\n    <dependency>\n      <groupId>org.apache.hadoop</groupId>\n      <artifactId>hadoop-client</artifactId>\n      <version>2.2.0</version>\n      <scope>compile</scope>\n    </dependency>\n    <dependency>\n      <groupId>commons-cli</groupId>\n      <artifactId>commons-cli</artifactId>\n      <version>1.1</version>\n      <scope>compile</scope>\n    </dependency>\n    <dependency>\n      <groupId>org.mockito</groupId>\n      <artifactId>mockito-core</artifactId>\n      <version>1.8.5</version>\n      <scope>test</scope>\n    </dependency>\n    <dependency>\n      <groupId>junit</groupId>\n      <artifactId>junit</artifactId>\n      <version>4.7</version>\n      <scope>test</scope>\n    </dependency>\n  </dependencies>\n\n  <build>\n    <plugins>\n      <plugin>\n        <artifactId>maven-compiler-plugin</artifactId>\n        <configuration>\n          <source>1.6</source>\n          <target>1.6</target>\n        </configuration>\n      </plugin>\n      <plugin>\n        <groupId>org.apache.maven.plugins</groupId>\n        <artifactId>maven-jar-plugin</artifactId>\n        <configuration>\n          <archive>\n            <manifest>\n              <addClasspath>true</addClasspath>\n\t\t\t  <classpathPrefix>lib/</classpathPrefix>\n              <mainClass>org.notmysock.tpcds.GenTable</mainClass>\n            </manifest>\n          </archive>\n        </configuration>\n      </plugin>\n\t  
<plugin>\n\t\t<groupId>org.apache.maven.plugins</groupId>\n\t\t<artifactId>maven-dependency-plugin</artifactId>\n        <executions>\n\t\t  <execution>\n            <id>copy-dependencies</id>\n            <phase>package</phase>\n            <goals>\n\t\t\t  <goal>copy-dependencies</goal>\n            </goals>\n            <configuration>\n                <outputDirectory>${project.build.directory}/lib</outputDirectory>\n            </configuration>\n          </execution>\n        </executions>\n      </plugin>\n    </plugins>\n  </build>\n\n  <pluginRepositories>\n    <pluginRepository>\n      <id>central</id>\n        <name>Central Repository</name>\n        <url>https://repo.maven.apache.org/maven2</url>\n        <layout>default</layout>\n        <snapshots>\n          <enabled>false</enabled>\n        </snapshots>\n        <releases>\n          <updatePolicy>never</updatePolicy>\n        </releases>\n    </pluginRepository>\n  </pluginRepositories>\n  <repositories>\n    <repository>\n      <id>central</id>\n      <name>Central Repository</name>\n      <url>https://repo.maven.apache.org/maven2</url>\n      <layout>default</layout>\n      <snapshots>\n        <enabled>false</enabled>\n      </snapshots>\n    </repository>\n  </repositories>\n\n</project>\n"
  },
  {
    "path": "tpcds-gen/src/main/java/org/notmysock/tpcds/GenTable.java",
    "content": "package org.notmysock.tpcds;\n\nimport org.apache.hadoop.conf.*;\nimport org.apache.hadoop.fs.*;\nimport org.apache.hadoop.hdfs.*;\nimport org.apache.hadoop.io.*;\nimport org.apache.hadoop.util.*;\nimport org.apache.hadoop.filecache.*;\nimport org.apache.hadoop.mapreduce.*;\nimport org.apache.hadoop.mapreduce.lib.input.*;\nimport org.apache.hadoop.mapreduce.lib.output.*;\nimport org.apache.hadoop.mapreduce.lib.reduce.*;\n\nimport org.apache.commons.cli.*;\nimport org.apache.commons.*;\n\nimport java.io.*;\nimport java.nio.*;\nimport java.util.*;\nimport java.net.*;\nimport java.math.*;\nimport java.security.*;\n\n\npublic class GenTable extends Configured implements Tool {\n    public static void main(String[] args) throws Exception {\n        Configuration conf = new Configuration();\n        int res = ToolRunner.run(conf, new GenTable(), args);\n        System.exit(res);\n    }\n\n    @Override\n    public int run(String[] args) throws Exception {\n        String[] remainingArgs = new GenericOptionsParser(getConf(), args).getRemainingArgs();\n\n        CommandLineParser parser = new BasicParser();\n        getConf().setInt(\"io.sort.mb\", 4);\n        org.apache.commons.cli.Options options = new org.apache.commons.cli.Options();\n        options.addOption(\"s\",\"scale\", true, \"scale\");\n        options.addOption(\"t\",\"table\", true, \"table\");\n        options.addOption(\"d\",\"dir\", true, \"dir\");\n        options.addOption(\"p\", \"parallel\", true, \"parallel\");\n        CommandLine line = parser.parse(options, remainingArgs);\n\n        if(!(line.hasOption(\"scale\") && line.hasOption(\"dir\"))) {\n          HelpFormatter f = new HelpFormatter();\n          f.printHelp(\"GenTable\", options);\n          return 1;\n        }\n        \n        int scale = Integer.parseInt(line.getOptionValue(\"scale\"));\n        String table = \"all\";\n        if(line.hasOption(\"table\")) {\n          table = line.getOptionValue(\"table\");\n       
 }\n        Path out = new Path(line.getOptionValue(\"dir\"));\n\n        int parallel = scale;\n\n        if(line.hasOption(\"parallel\")) {\n          parallel = Integer.parseInt(line.getOptionValue(\"parallel\"));\n        }\n\n        if(parallel == 1 || scale == 1) {\n          System.err.println(\"The MR task does not work for scale=1 or parallel=1\");\n          return 1;\n        }\n\n        Path in = genInput(table, scale, parallel);\n\n        Path dsdgen = copyJar(new File(\"target/lib/dsdgen.jar\"));\n        URI dsuri = dsdgen.toUri();\n        URI link = new URI(dsuri.getScheme(),\n                    dsuri.getUserInfo(), dsuri.getHost(), \n                    dsuri.getPort(),dsuri.getPath(), \n                    dsuri.getQuery(),\"dsdgen\");\n        Configuration conf = getConf();\n        conf.setInt(\"mapred.task.timeout\",0);\n        conf.setInt(\"mapreduce.task.timeout\",0);\n        conf.setBoolean(\"mapreduce.map.output.compress\", true);\n        conf.set(\"mapreduce.map.output.compress.codec\", \"org.apache.hadoop.io.compress.GzipCodec\");\n        DistributedCache.addCacheArchive(link, conf);\n        DistributedCache.createSymlink(conf);\n        Job job = new Job(conf, \"GenTable+\"+table+\"_\"+scale);\n        job.setJarByClass(getClass());\n        job.setNumReduceTasks(0);\n        job.setMapperClass(DSDGen.class);\n        job.setOutputKeyClass(Text.class);\n        job.setOutputValueClass(Text.class);\n\n        job.setInputFormatClass(NLineInputFormat.class);\n        NLineInputFormat.setNumLinesPerSplit(job, 1);\n\n        FileInputFormat.addInputPath(job, in);\n        FileOutputFormat.setOutputPath(job, out);\n\n        // use multiple output to only write the named files\n        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);\n        MultipleOutputs.addNamedOutput(job, \"text\", \n          TextOutputFormat.class, LongWritable.class, Text.class);\n\n        boolean success = 
job.waitForCompletion(true);\n\n        // cleanup\n        FileSystem fs = FileSystem.get(getConf());\n        \n        fs.delete(in, false);\n        fs.delete(dsdgen, false);\n\n        // report job failure through the exit code\n        return success ? 0 : 1;\n    }\n\n    public Path copyJar(File jar) throws Exception {\n      MessageDigest md = MessageDigest.getInstance(\"MD5\");\n      InputStream is = new DigestInputStream(new FileInputStream(jar), md);\n      try {\n        // read the stream to EOF so the digest covers the whole jar\n        byte[] buf = new byte[8192];\n        while(is.read(buf) != -1) {\n          // DigestInputStream updates md as a side effect of each read\n        }\n      }\n      finally {\n        is.close();\n      }\n      // signum 1 keeps the value non-negative, so the hex name has no leading minus sign\n      BigInteger md5 = new BigInteger(1, md.digest());\n      String md5hex = md5.toString(16);\n      Path dst = new Path(String.format(\"/tmp/%s.jar\",md5hex));\n      Path src = new Path(jar.toURI());\n      FileSystem fs = FileSystem.get(getConf());\n      fs.copyFromLocalFile(false, /*overwrite*/true, src, dst);\n      return dst; \n    }\n\n    public Path genInput(String table, int scale, int parallel) throws Exception {\n        long epoch = System.currentTimeMillis()/1000;\n\n        Path in = new Path(\"/tmp/\"+table+\"_\"+scale+\"-\"+epoch);\n        FileSystem fs = FileSystem.get(getConf());\n        FSDataOutputStream out = fs.create(in);\n        for(int i = 1; i <= parallel; i++) {\n          if(table.equals(\"all\")) {\n            out.writeBytes(String.format(\"./dsdgen -dir $DIR -force Y -scale %d -parallel %d -child %d\\n\", scale, parallel, i));\n          } else {\n            out.writeBytes(String.format(\"./dsdgen -dir $DIR -table %s -force Y -scale %d -parallel %d -child %d\\n\", table, scale, parallel, i));\n          }\n        }\n        out.close();\n        return in;\n    }\n\n    static String readToString(InputStream in) throws IOException {\n      InputStreamReader is = new InputStreamReader(in);\n      StringBuilder sb=new StringBuilder();\n      BufferedReader br = new BufferedReader(is);\n      String read = br.readLine();\n\n      while(read != null) {\n        //System.out.println(read);\n        
sb.append(read);\n        read =br.readLine();\n      }\n      return sb.toString();\n    }\n\n    static final class DSDGen extends Mapper<LongWritable,Text, Text, Text> {\n      private MultipleOutputs mos;\n      protected void setup(Context context) throws IOException {\n        mos = new MultipleOutputs(context);\n      }\n      protected void cleanup(Context context) throws IOException, InterruptedException {\n        mos.close();\n      }\n      protected void map(LongWritable offset, Text command, Mapper.Context context) \n        throws IOException, InterruptedException {\n        String parallel=\"1\";\n        String child=\"1\";\n\n        String[] cmd = command.toString().split(\" \");\n\n        for(int i=0; i<cmd.length; i++) {\n          if(cmd[i].equals(\"$DIR\")) {\n            cmd[i] = (new File(\".\")).getAbsolutePath();\n          }\n          if(cmd[i].equals(\"-parallel\")) {\n            parallel = cmd[i+1];\n          }\n          if(cmd[i].equals(\"-child\")) {\n            child = cmd[i+1];\n          }\n        }\n\n        Process p = Runtime.getRuntime().exec(cmd, null, new File(\"dsdgen/tools/\"));\n        int status = p.waitFor();\n        if(status != 0) {\n          String err = readToString(p.getErrorStream());\n          throw new InterruptedException(\"Process failed with status code \" + status + \"\\n\" + err);\n        }\n\n        File cwd = new File(\".\");\n        final String suffix = String.format(\"_%s_%s.dat\", child, parallel);\n\n        FilenameFilter tables = new FilenameFilter() {\n          public boolean accept(File dir, String name) {\n            return name.endsWith(suffix);\n          }\n        };\n\n        for(File f: cwd.listFiles(tables)) {\n          BufferedReader br = new BufferedReader(new FileReader(f));          \n          String line;\n          while ((line = br.readLine()) != null) {\n            // process the line.\n            mos.write(\"text\", line, null, 
f.getName().replace(suffix,\"/data\"));\n          }\n          br.close();\n          f.deleteOnExit();\n        }\n      }\n    }\n}\n"
  },
  {
    "path": "tpcds-setup.sh",
    "content": "#!/bin/bash\n\nfunction usage {\n\techo \"Usage: tpcds-setup.sh scale_factor [temp_directory]\"\n\texit 1\n}\n\nfunction runcommand {\n\tif [ \"X$DEBUG_SCRIPT\" != \"X\" ]; then\n\t\t$1\n\telse\n\t\t$1 2>/dev/null\n\tfi\n}\n\nif [ ! -f tpcds-gen/target/tpcds-gen-1.0-SNAPSHOT.jar ]; then\n\techo \"Please build the data generator with ./tpcds-build.sh first\"\n\texit 1\nfi\nwhich hive > /dev/null 2>&1\nif [ $? -ne 0 ]; then\n\techo \"Script must be run where Hive is installed\"\n\texit 1\nfi\n\n# Tables in the TPC-DS schema.\nDIMS=\"date_dim time_dim item customer customer_demographics household_demographics customer_address store promotion warehouse ship_mode reason income_band call_center web_page catalog_page web_site\"\nFACTS=\"store_sales store_returns web_sales web_returns catalog_sales catalog_returns inventory\"\n\n# Get the parameters.\nSCALE=$1\nDIR=$2\nif [ \"X$BUCKET_DATA\" != \"X\" ]; then\n\tBUCKETS=13\n\tRETURN_BUCKETS=13\nelse\n\tBUCKETS=1\n\tRETURN_BUCKETS=1\nfi\nif [ \"X$DEBUG_SCRIPT\" != \"X\" ]; then\n\tset -x\nfi\n\n# Sanity checking.\nif [ X\"$SCALE\" = \"X\" ]; then\n\tusage\nfi\nif [ X\"$DIR\" = \"X\" ]; then\n\tDIR=/tmp/tpcds-generate\nfi\nif [ $SCALE -eq 1 ]; then\n\techo \"Scale factor must be greater than 1\"\n\texit 1\nfi\n\n# Do the actual data load.\nhdfs dfs -mkdir -p ${DIR}\nhdfs dfs -ls ${DIR}/${SCALE} > /dev/null\nif [ $? -ne 0 ]; then\n\techo \"Generating data at scale factor $SCALE.\"\n\t(cd tpcds-gen; hadoop jar target/*.jar -d ${DIR}/${SCALE}/ -s ${SCALE})\nfi\nhdfs dfs -ls ${DIR}/${SCALE} > /dev/null\nif [ $? -ne 0 ]; then\n\techo \"Data generation failed, exiting.\"\n\texit 1\nfi\n\nhadoop fs -chmod -R 777  ${DIR}/${SCALE}\n\necho \"TPC-DS text data generation complete.\"\n\nHIVE=\"beeline -n hive -u 'jdbc:hive2://localhost:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2?tez.queue.name=default' \"\n\n# Create the text/flat tables as external tables. 
These will later be converted to ORCFile.\necho \"Loading text data into external tables.\"\nruncommand \"$HIVE  -i settings/load-flat.sql -f ddl-tpcds/text/alltables.sql --hivevar DB=tpcds_text_${SCALE} --hivevar LOCATION=${DIR}/${SCALE}\"\n\n# Create the partitioned and bucketed tables.\nif [ \"X$FORMAT\" = \"X\" ]; then\n\tFORMAT=orc\nfi\n\nLOAD_FILE=\"load_${FORMAT}_${SCALE}.mk\"\nSILENCE=\"2> /dev/null 1> /dev/null\" \nif [ \"X$DEBUG_SCRIPT\" != \"X\" ]; then\n\tSILENCE=\"\"\nfi\n\necho -e \"all: ${DIMS} ${FACTS}\" > $LOAD_FILE\n\ni=1\ntotal=24\nDATABASE=tpcds_bin_partitioned_${FORMAT}_${SCALE}\nMAX_REDUCERS=2500 # maximum number of useful reducers for any scale \nREDUCERS=$((test ${SCALE} -gt ${MAX_REDUCERS} && echo ${MAX_REDUCERS}) || echo ${SCALE})\n\n# Populate the smaller tables.\nfor t in ${DIMS}\ndo\n\tCOMMAND=\"$HIVE  -i settings/load-partitioned.sql -f ddl-tpcds/bin_partitioned/${t}.sql \\\n\t    --hivevar DB=${DATABASE} --hivevar SOURCE=tpcds_text_${SCALE} \\\n            --hivevar SCALE=${SCALE} \\\n\t    --hivevar REDUCERS=${REDUCERS} \\\n\t    --hivevar FILE=${FORMAT}\"\n\techo -e \"${t}:\\n\\t@$COMMAND $SILENCE && echo 'Optimizing table $t ($i/$total).'\" >> $LOAD_FILE\n\ti=`expr $i + 1`\ndone\n\nfor t in ${FACTS}\ndo\n\tCOMMAND=\"$HIVE  -i settings/load-partitioned.sql -f ddl-tpcds/bin_partitioned/${t}.sql \\\n\t    --hivevar DB=${DATABASE} \\\n            --hivevar SCALE=${SCALE} \\\n\t    --hivevar SOURCE=tpcds_text_${SCALE} --hivevar BUCKETS=${BUCKETS} \\\n\t    --hivevar RETURN_BUCKETS=${RETURN_BUCKETS} --hivevar REDUCERS=${REDUCERS} --hivevar FILE=${FORMAT}\"\n\techo -e \"${t}:\\n\\t@$COMMAND $SILENCE && echo 'Optimizing table $t ($i/$total).'\" >> $LOAD_FILE\n\ti=`expr $i + 1`\ndone\n\nmake -j 1 -f $LOAD_FILE\n\n\necho \"Loading constraints\"\nruncommand \"$HIVE -f ddl-tpcds/bin_partitioned/add_constraints.sql --hivevar DB=${DATABASE}\"\n\necho \"Data loaded into database ${DATABASE}.\"\n"
  },
  {
    "path": "tpch-build.sh",
    "content": "#!/bin/sh\n\n# Check for all the stuff I need to function.\nfor f in gcc javac; do\n\twhich $f > /dev/null 2>&1\n\tif [ $? -ne 0 ]; then\n\t\techo \"Required program $f is missing. Please install or fix your path and try again.\"\n\t\texit 1\n\tfi\ndone\n\n# Check if Maven is installed and install it if not.\nwhich mvn > /dev/null 2>&1\nif [ $? -ne 0 ]; then\n\tSKIP=0\n\tif [ -e \"apache-maven-3.0.5-bin.tar.gz\" ]; then\n\t\tSIZE=`du -b apache-maven-3.0.5-bin.tar.gz | cut -f 1`\n\t\tif [ $SIZE -eq 5144659 ]; then\n\t\t\tSKIP=1\n\t\tfi\n\tfi\n\tif [ $SKIP -ne 1 ]; then\n\t\techo \"Maven not found, automatically installing it.\"\n\t\tcurl -O https://downloads.apache.org/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz 2> /dev/null\n\t\tif [ $? -ne 0 ]; then\n\t\t\techo \"Failed to download Maven, check Internet connectivity and try again.\"\n\t\t\texit 1\n\t\tfi\n\tfi\n\ttar -zxf apache-maven-3.0.5-bin.tar.gz > /dev/null\n\tCWD=$(pwd)\n\texport MAVEN_HOME=\"$CWD/apache-maven-3.0.5\"\n\texport PATH=$PATH:$MAVEN_HOME/bin\nfi\n\necho \"Building TPC-H Data Generator\"\n(cd tpch-gen; make)\necho \"TPC-H Data Generator built, you can now use tpch-setup.sh to generate data.\"\n"
  },
  {
    "path": "tpch-gen/Makefile",
    "content": "MYOS=$(shell uname -s)\n\nall: target/lib/dbgen.jar target/tpch-gen-1.0-SNAPSHOT.jar\n\ntarget/tpch-gen-1.0-SNAPSHOT.jar: $(shell find . -name '*.java') \n\tmvn package\n\ntarget/tpch_kit.zip: tpch_kit.zip\n\tmkdir -p target/\n\tcp tpch_kit.zip target/tpch_kit.zip\n\ntpch_kit.zip:\n\tcurl http://dev.hortonworks.com.s3.amazonaws.com/hive-testbench/tpch/README\n\tcurl --output tpch_kit.zip http://dev.hortonworks.com.s3.amazonaws.com/hive-testbench/tpch/tpch_kit.zip\n\ntarget/lib/dbgen.jar: target/tools/dbgen\n\tcd target/; mkdir -p lib/; ( jar cvf lib/dbgen.jar tools/ || gjar cvf lib/dbgen.jar tools/ )\n\ntarget/tools/dbgen: target/tpch_kit.zip\n\ttest -d target/tools/ || (cd target; unzip tpch_kit.zip -x __MACOSX/; ln -sf $$PWD/*/dbgen/ tools)\n\tcd target/tools; cat ../../../patches/${MYOS}/*.patch | patch -p0\n\tcd target/tools; make -f makefile.suite clean; make -f makefile.suite CC=gcc DATABASE=ORACLE MACHINE=LINUX WORKLOAD=TPCH\n\nclean:\n\tmvn clean\n"
  },
  {
    "path": "tpch-gen/README.md",
    "content": "MapReduce TPC-H Generator\n=========================\n\nThis simplifies creating TPC-H data sets at large scale on a Hadoop cluster.\n\nTo get set up, you need to run\n\n\t$ make \n\nThis will download the TPC-H dbgen program, compile it, and use Maven to build the MR app wrapped around it.\n\nTo generate the data sets, you need to run (say, for scale = 200, parallelism = 100)\n\n\t$ hadoop  jar target/tpch-gen-1.0-SNAPSHOT.jar   -d /user/hive/external/200/ -p 100 -s 200 \n\nThis uses the existing parallelism in the dbgen program without modification to run the command on multiple machines.\n\nEach map task generates multiple files, and each table ends up in its own subdirectory.\n\nThis assumes that all machines in the cluster are identical in OS, architecture, and libraries.\n"
  },
  {
    "path": "tpch-gen/ddl/orc.sql",
    "content": "set hive.stats.autogather=true;\nset hive.stats.dbclass=fs;\n\ncreate table if not exists lineitem \n(L_ORDERKEY BIGINT,\n L_PARTKEY BIGINT,\n L_SUPPKEY BIGINT,\n L_LINENUMBER INT,\n L_QUANTITY DOUBLE,\n L_EXTENDEDPRICE DOUBLE,\n L_DISCOUNT DOUBLE,\n L_TAX DOUBLE,\n L_RETURNFLAG STRING,\n L_LINESTATUS STRING,\n L_SHIPDATE STRING,\n L_COMMITDATE STRING,\n L_RECEIPTDATE STRING,\n L_SHIPINSTRUCT STRING,\n L_SHIPMODE STRING,\n L_COMMENT STRING)\nSTORED AS ORC TBLPROPERTIES (\"orc.compress\"=\"SNAPPY\")\n;\n\ncreate table if not exists part (P_PARTKEY INT,\n P_NAME STRING,\n P_MFGR STRING,\n P_BRAND STRING,\n P_TYPE STRING,\n P_SIZE INT,\n P_CONTAINER STRING,\n P_RETAILPRICE DOUBLE,\n P_COMMENT STRING) \nSTORED AS ORC TBLPROPERTIES (\"orc.compress\"=\"SNAPPY\")\n;\n\ncreate table if not exists supplier (S_SUPPKEY BIGINT,\n S_NAME STRING,\n S_ADDRESS STRING,\n S_NATIONKEY INT,\n S_PHONE STRING,\n S_ACCTBAL DOUBLE,\n S_COMMENT STRING) \nSTORED AS ORC TBLPROPERTIES (\"orc.compress\"=\"SNAPPY\")\n;\n\ncreate table if not exists partsupp (PS_PARTKEY BIGINT,\n PS_SUPPKEY BIGINT,\n PS_AVAILQTY INT,\n PS_SUPPLYCOST DOUBLE,\n PS_COMMENT STRING)\nSTORED AS ORC TBLPROPERTIES (\"orc.compress\"=\"SNAPPY\")\n;\n\ncreate table if not exists nation (N_NATIONKEY INT,\n N_NAME STRING,\n N_REGIONKEY INT,\n N_COMMENT STRING)\nSTORED AS ORC TBLPROPERTIES (\"orc.compress\"=\"SNAPPY\")\n;\n\ncreate table if not exists region (R_REGIONKEY INT,\n R_NAME STRING,\n R_COMMENT STRING)\nSTORED AS ORC TBLPROPERTIES (\"orc.compress\"=\"SNAPPY\")\n;\n\ncreate table if not exists customer (C_CUSTKEY BIGINT,\n C_NAME STRING,\n C_ADDRESS STRING,\n C_NATIONKEY INT,\n C_PHONE STRING,\n C_ACCTBAL DOUBLE,\n C_MKTSEGMENT STRING,\n C_COMMENT STRING)\nSTORED AS ORC TBLPROPERTIES (\"orc.compress\"=\"SNAPPY\")\n;\n\ncreate table if not exists orders (O_ORDERKEY BIGINT,\n O_CUSTKEY BIGINT,\n O_ORDERSTATUS STRING,\n O_TOTALPRICE DOUBLE,\n O_ORDERDATE STRING,\n O_ORDERPRIORITY STRING,\n O_CLERK 
STRING,\n O_SHIPPRIORITY INT,\n O_COMMENT STRING)\nSTORED AS ORC TBLPROPERTIES (\"orc.compress\"=\"SNAPPY\")\n;\n\ninsert overwrite table nation select * from ${SOURCE}.nation;\ninsert overwrite table region select * from ${SOURCE}.region;\ninsert overwrite table part select * from ${SOURCE}.part;\ninsert overwrite table supplier select * from ${SOURCE}.supplier;\ninsert overwrite table partsupp select * from ${SOURCE}.partsupp;\ninsert overwrite table customer select * from ${SOURCE}.customer;\ninsert overwrite table lineitem select * from ${SOURCE}.lineitem;\ninsert overwrite table orders select * from ${SOURCE}.orders;\n"
  },
  {
    "path": "tpch-gen/ddl/text.sql",
    "content": "create external table lineitem \n(L_ORDERKEY BIGINT,\n L_PARTKEY BIGINT,\n L_SUPPKEY BIGINT,\n L_LINENUMBER INT,\n L_QUANTITY DOUBLE,\n L_EXTENDEDPRICE DOUBLE,\n L_DISCOUNT DOUBLE,\n L_TAX DOUBLE,\n L_RETURNFLAG STRING,\n L_LINESTATUS STRING,\n L_SHIPDATE STRING,\n L_COMMITDATE STRING,\n L_RECEIPTDATE STRING,\n L_SHIPINSTRUCT STRING,\n L_SHIPMODE STRING,\n L_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE \nLOCATION '${LOCATION}/lineitem';\n\ncreate external table part (P_PARTKEY BIGINT,\n P_NAME STRING,\n P_MFGR STRING,\n P_BRAND STRING,\n P_TYPE STRING,\n P_SIZE INT,\n P_CONTAINER STRING,\n P_RETAILPRICE DOUBLE,\n P_COMMENT STRING) \nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE \nLOCATION '${LOCATION}/part/';\n\ncreate external table supplier (S_SUPPKEY BIGINT,\n S_NAME STRING,\n S_ADDRESS STRING,\n S_NATIONKEY INT,\n S_PHONE STRING,\n S_ACCTBAL DOUBLE,\n S_COMMENT STRING) \nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE \nLOCATION '${LOCATION}/supplier/';\n\ncreate external table partsupp (PS_PARTKEY BIGINT,\n PS_SUPPKEY BIGINT,\n PS_AVAILQTY INT,\n PS_SUPPLYCOST DOUBLE,\n PS_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE\nLOCATION'${LOCATION}/partsupp';\n\ncreate external table nation (N_NATIONKEY INT,\n N_NAME STRING,\n N_REGIONKEY INT,\n N_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE\nLOCATION '${LOCATION}/nation';\n\ncreate external table region (R_REGIONKEY INT,\n R_NAME STRING,\n R_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE\nLOCATION '${LOCATION}/region';\n\ncreate external table customer (C_CUSTKEY BIGINT,\n C_NAME STRING,\n C_ADDRESS STRING,\n C_NATIONKEY INT,\n C_PHONE STRING,\n C_ACCTBAL DOUBLE,\n C_MKTSEGMENT STRING,\n C_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE\nLOCATION '${LOCATION}/customer';\n\ncreate external 
table orders (O_ORDERKEY BIGINT,\n O_CUSTKEY BIGINT,\n O_ORDERSTATUS STRING,\n O_TOTALPRICE DOUBLE,\n O_ORDERDATE STRING,\n O_ORDERPRIORITY STRING,\n O_CLERK STRING,\n O_SHIPPRIORITY INT,\n O_COMMENT STRING)\nROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE\nLOCATION '${LOCATION}/orders';\n"
  },
  {
    "path": "tpch-gen/patches/Darwin/macosx.patch",
    "content": "--- makefile.suite.orig\t2014-06-25 15:40:27.000000000 -0700\n+++ makefile.suite\t2014-06-25 15:42:03.000000000 -0700\n@@ -110,7 +110,7 @@\n MACHINE = \n WORKLOAD = \n #\n-CFLAGS\t= -g -DDBNAME=\\\"dss\\\" -D$(MACHINE) -D$(DATABASE) -D$(WORKLOAD) -DRNG_TEST -D_FILE_OFFSET_BITS=64 \n+CFLAGS\t= -g -DDBNAME=\\\"dss\\\" -D$(MACHINE) -D$(DATABASE) -D$(WORKLOAD) -DRNG_TEST -D_FILE_OFFSET_BITS=64  -I/usr/include/malloc\n LDFLAGS = -O\n # The OBJ,EXE and LIB macros will need to be changed for compilation under\n #  Windows NT\n"
  },
  {
    "path": "tpch-gen/pom.xml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n\n<project xmlns=\"http://maven.apache.org/POM/4.0.0\"\n    xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\n    xsi:schemaLocation=\"http://maven.apache.org/POM/4.0.0\n                        http://maven.apache.org/maven-v4_0_0.xsd\">\n\n  <modelVersion>4.0.0</modelVersion>\n\n  <groupId>org.notmysock.tpch</groupId>\n  <artifactId>tpch-gen</artifactId>\n  <version>1.0-SNAPSHOT</version>\n  <packaging>jar</packaging>\n\n  <name>tpch-gen</name>\n  <url>http://maven.apache.org</url>\n\n  <dependencies>\n    <dependency>\n      <groupId>org.apache.hadoop</groupId>\n      <artifactId>hadoop-client</artifactId>\n      <version>3.1.1</version>\n      <scope>compile</scope>\n    </dependency>\n    <dependency>\n      <groupId>commons-cli</groupId>\n      <artifactId>commons-cli</artifactId>\n      <version>1.1</version>\n      <scope>compile</scope>\n    </dependency>\n    <dependency>\n      <groupId>org.mockito</groupId>\n      <artifactId>mockito-core</artifactId>\n      <version>1.8.5</version>\n      <scope>test</scope>\n    </dependency>\n    <dependency>\n      <groupId>junit</groupId>\n      <artifactId>junit</artifactId>\n      <version>4.7</version>\n      <scope>test</scope>\n    </dependency>\n  </dependencies>\n\n  <build>\n    <plugins>\n      <plugin>\n        <artifactId>maven-compiler-plugin</artifactId>\n        <configuration>\n          <source>1.6</source>\n          <target>1.6</target>\n        </configuration>\n      </plugin>\n      <plugin>\n        <groupId>org.apache.maven.plugins</groupId>\n        <artifactId>maven-jar-plugin</artifactId>\n        <configuration>\n          <archive>\n            <manifest>\n              <addClasspath>true</addClasspath>\n\t\t\t  <classpathPrefix>lib/</classpathPrefix>\n              <mainClass>org.notmysock.tpch.GenTable</mainClass>\n            </manifest>\n          </archive>\n        </configuration>\n      </plugin>\n\t  
<plugin>\n\t\t<groupId>org.apache.maven.plugins</groupId>\n\t\t<artifactId>maven-dependency-plugin</artifactId>\n        <executions>\n\t\t  <execution>\n            <id>copy-dependencies</id>\n            <phase>package</phase>\n            <goals>\n\t\t\t  <goal>copy-dependencies</goal>\n            </goals>\n            <configuration>\n                <outputDirectory>${project.build.directory}/lib</outputDirectory>\n            </configuration>\n          </execution>\n        </executions>\n      </plugin>\n    </plugins>\n  </build>\n\t\n  <pluginRepositories>\n    <pluginRepository>\n      <id>central</id>\n        <name>Central Repository</name>\n        <url>https://repo.maven.apache.org/maven2</url>\n        <layout>default</layout>\n        <snapshots>\n          <enabled>false</enabled>\n        </snapshots>\n        <releases>\n          <updatePolicy>never</updatePolicy>\n        </releases>\n    </pluginRepository>\n  </pluginRepositories>\n  <repositories>\n    <repository>\n      <id>central</id>\n      <name>Central Repository</name>\n      <url>https://repo.maven.apache.org/maven2</url>\n      <layout>default</layout>\n      <snapshots>\n        <enabled>false</enabled>\n      </snapshots>\n    </repository>\n  </repositories>\n\n</project>\n"
  },
  {
    "path": "tpch-gen/src/main/java/org/notmysock/tpch/GenTable.java",
    "content": "package org.notmysock.tpch;\n\nimport org.apache.hadoop.conf.*;\nimport org.apache.hadoop.fs.*;\nimport org.apache.hadoop.hdfs.*;\nimport org.apache.hadoop.io.*;\nimport org.apache.hadoop.io.compress.DefaultCodec;\nimport org.apache.hadoop.io.compress.SnappyCodec;\nimport org.apache.hadoop.util.*;\nimport org.apache.hadoop.filecache.*;\nimport org.apache.hadoop.mapreduce.*;\nimport org.apache.hadoop.mapreduce.lib.input.*;\nimport org.apache.hadoop.mapreduce.lib.output.*;\nimport org.apache.hadoop.mapreduce.lib.reduce.*;\nimport org.apache.commons.cli.*;\nimport org.apache.commons.*;\n\nimport java.io.*;\nimport java.nio.*;\nimport java.util.*;\nimport java.net.*;\nimport java.math.*;\nimport java.security.*;\n\n\npublic class GenTable extends Configured implements Tool {\n\t\n\tprivate static enum TableMappings {\n\t\tALL(\"all\"),\n\t\tCUSTOMERS(\"c\"),\n\t\tSUPPLIERS(\"s\"),\n\t\tNATION(\"l\"),\n\t\tORDERS(\"o\"),\n\t\tPARTS(\"p\");\n\t\t\n\t\t/*\n\t\t-T c   -- generate customers ONLY\n\t\t-T l   -- generate nation/region ONLY\n\t\t-T o   -- generate orders/lineitem ONLY\n\t\t-T p   -- generate parts/partsupp ONLY\n\t\t-T s   -- generate suppliers ONLY\n\t\t*/\n\t\t\n\t\t\n\t\tfinal String option;\n\t\t\n\t\tTableMappings(String option) {\n\t\t\tthis.option = option;\n\t\t}\n\t}\n\t\n    public static void main(String[] args) throws Exception {\n        Configuration conf = new Configuration();\n        int res = ToolRunner.run(conf, new GenTable(), args);\n        System.exit(res);\n    }\n\n    @Override\n    public int run(String[] args) throws Exception {\n        String[] remainingArgs = new GenericOptionsParser(getConf(), args).getRemainingArgs();\n\n        CommandLineParser parser = new BasicParser();\n        getConf().setInt(\"io.sort.mb\", 4);\n        org.apache.commons.cli.Options options = new org.apache.commons.cli.Options();\n        options.addOption(\"s\",\"scale\", true, \"scale\");\n        options.addOption(\"t\",\"table\", 
true, \"table\");\n        options.addOption(\"d\",\"dir\", true, \"dir\");\n        options.addOption(\"p\", \"parallel\", true, \"parallel\");\n        options.addOption(\"text\", \"text\", false, \"text\");\n        options.addOption(\"snappy\", \"snappy\", false, \"snappy\");\n        CommandLine line = parser.parse(options, remainingArgs);\n\n        if(!(line.hasOption(\"scale\") && line.hasOption(\"dir\"))) {\n          HelpFormatter f = new HelpFormatter();\n          f.printHelp(\"GenTable\", options);\n          return 1;\n        }\n        \n        int scale = Integer.parseInt(line.getOptionValue(\"scale\"));\n        String table = \"all\";\n        if(line.hasOption(\"table\")) {\n          table = line.getOptionValue(\"table\");\n          table = TableMappings.valueOf(table.toUpperCase()).option;\n        }\n        Path out = new Path(line.getOptionValue(\"dir\"));\n\n        int parallel = scale;\n\n        if(line.hasOption(\"parallel\")) {\n          parallel = Integer.parseInt(line.getOptionValue(\"parallel\"));\n        }\n\n        if(parallel == 1 || scale == 1) {\n          System.err.println(\"The MR task does not work for scale=1 or parallel=1\");\n          return 1;\n        }\n\n        Path in = genInput(table, scale, parallel);\n\n        Path dbgen = copyJar(new File(\"target/lib/dbgen.jar\"));\n        URI dsuri = dbgen.toUri();\n        URI link = new URI(dsuri.getScheme(),\n                    dsuri.getUserInfo(), dsuri.getHost(), \n                    dsuri.getPort(),dsuri.getPath(), \n                    dsuri.getQuery(),\"dbgen\");\n        Configuration conf = getConf();\n        conf.setInt(\"mapred.task.timeout\",0);\n        conf.setInt(\"mapreduce.task.timeout\",0);\n        DistributedCache.addCacheArchive(link, conf);\n        Job job = new Job(conf, \"GenTable+\"+table+\"_\"+scale);\n        job.setJarByClass(getClass());\n        job.setNumReduceTasks(0);\n        job.setMapperClass(dbgen.class);\n        
job.setOutputKeyClass(Text.class);\n        job.setOutputValueClass(Text.class);\n\n        job.setInputFormatClass(NLineInputFormat.class);\n        NLineInputFormat.setNumLinesPerSplit(job, 1);\n\n        FileInputFormat.addInputPath(job, in);\n        FileOutputFormat.setOutputPath(job, out);\n\n        // use multiple output to only write the named files\n        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);\n        MultipleOutputs.addNamedOutput(job, \"text\", \n          TextOutputFormat.class, LongWritable.class, Text.class);\n\n        if (line.hasOption(\"snappy\") || (line.hasOption(\"text\") == false)) {\n          TextOutputFormat.setCompressOutput(job, true);\n          if (line.hasOption(\"snappy\")) {\n             TextOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);\n          } else {\n             TextOutputFormat.setOutputCompressorClass(job, DefaultCodec.class);\n          }\n        }\n\n        boolean success = job.waitForCompletion(true);\n\n        // cleanup\n        FileSystem fs = FileSystem.get(getConf());\n        \n        fs.delete(in, false);\n        fs.delete(dbgen, false);\n\n        // report job failure to the caller instead of always returning 0\n        return success ? 0 : 1;\n    }\n\n    public Path copyJar(File jar) throws Exception {\n      MessageDigest md = MessageDigest.getInstance(\"MD5\");\n      InputStream is = new FileInputStream(jar);\n      try {\n        is = new DigestInputStream(is, md);\n        // read the stream to EOF so the digest covers the whole jar\n        byte[] buf = new byte[8192];\n        while (is.read(buf) != -1) {\n          // the DigestInputStream updates md as bytes are read\n        }\n      }\n      finally {\n        is.close();\n      }\n      // signum 1 keeps the hex string free of a leading minus sign\n      BigInteger md5 = new BigInteger(1, md.digest()); \n      String md5hex = md5.toString(16);\n      Path dst = new Path(String.format(\"/tmp/%s.jar\",md5hex));\n      Path src = new Path(jar.toURI());\n      FileSystem fs = FileSystem.get(getConf());\n      fs.copyFromLocalFile(false, /*overwrite*/true, src, dst);\n      return dst; \n    }\n\n    public Path genInput(String table, int scale, int parallel) throws Exception {\n        long epoch = 
System.currentTimeMillis()/1000;\n\n        Path in = new Path(\"/tmp/\"+table+\"_\"+scale+\"-\"+epoch);\n        FileSystem fs = FileSystem.get(getConf());\n        FSDataOutputStream out = fs.create(in);\n        for(int i = 1; i <= parallel; i++) {\n          if(table.equals(\"all\")) {\n            out.writeBytes(String.format(\"$DIR/dbgen/tools/dbgen -b $DIR/dbgen/tools/dists.dss -f -s %d -C %d -S %d\\n\", scale, parallel, i));\n          } else {\n        \tout.writeBytes(String.format(\"$DIR/dbgen/tools/dbgen -b $DIR/dbgen/tools/dists.dss -f -s %d -C %d -S %d -T %s\\n\", scale, parallel, i, table));           \n          }\n        }\n        out.close();\n        return in;\n    }\n\n    static String readToString(InputStream in) throws IOException {\n      InputStreamReader is = new InputStreamReader(in);\n      StringBuilder sb=new StringBuilder();\n      BufferedReader br = new BufferedReader(is);\n      String read = br.readLine();\n\n      while(read != null) {\n        //System.out.println(read);\n        sb.append(read);\n        read =br.readLine();\n      }\n      return sb.toString();\n    }\n\n    static final class dbgen extends Mapper<LongWritable,Text, Text, Text> {\n      private MultipleOutputs mos;\n      protected void setup(Context context) throws IOException {\n        mos = new MultipleOutputs(context);\n      }\n      protected void cleanup(Context context) throws IOException, InterruptedException {\n        mos.close();\n      }\n      protected void map(LongWritable offset, Text command, Mapper.Context context) \n        throws IOException, InterruptedException {\n        String parallel=\"1\";\n        String child=\"1\";\n\n        String[] cmd = command.toString().split(\" \");\n\n        for(int i=0; i<cmd.length; i++) {\n          if(cmd[i].contains(\"$DIR\")) {\n            cmd[i] = cmd[i].replace(\"$DIR\",(new File(\".\")).getAbsolutePath());\n          }\n          if(cmd[i].equals(\"-C\")) {\n            parallel = 
cmd[i+1];\n          }\n          if(cmd[i].equals(\"-S\")) {\n            child = cmd[i+1];\n          }\n        }\n\n        Process p = Runtime.getRuntime().exec(cmd, null, new File(\".\"));\n        int status = p.waitFor();\n        if(status != 0) {\n          String err = readToString(p.getErrorStream());\n          throw new InterruptedException(\"Process failed with status code \" + status + \"\\n\" + err);\n        }\n\n        File cwd = new File(\".\");\n        final String suffix = String.format(\".tbl.%s\", child);\n        final boolean firstMapper = child.equals(\"1\");\n\n        FilenameFilter tables = new FilenameFilter() {\n          public boolean accept(File dir, String name) {\n            return name.endsWith(suffix) || (name.endsWith(\".tbl\") && firstMapper);\n          }\n        };\n\n        for(File f: cwd.listFiles(tables)) {\n          BufferedReader br = new BufferedReader(new FileReader(f));          \n          String line;\n          String name = f.getName().replace(suffix,\"/data\").replace(\".tbl\", \"/data\");\n          while ((line = br.readLine()) != null) {\n            // process the line.\n            mos.write(\"text\", line, null, name);\n          }\n          br.close();\n          f.deleteOnExit();\n        }\n      }\n    }\n}\n"
  },
  {
    "path": "tpch-setup.sh",
    "content": "#!/bin/bash\n\nfunction usage {\n\techo \"Usage: tpch-setup.sh scale_factor [temp_directory]\"\n\texit 1\n}\n\nfunction runcommand {\n\tif [ \"X$DEBUG_SCRIPT\" != \"X\" ]; then\n\t\t$1\n\telse\n\t\t$1 2>/dev/null\n\tfi\n}\n\nif [ ! -f tpch-gen/target/tpch-gen-1.0-SNAPSHOT.jar ]; then\n\techo \"Please build the data generator with ./tpch-build.sh first\"\n\texit 1\nfi\nwhich hive > /dev/null 2>&1\nif [ $? -ne 0 ]; then\n\techo \"Script must be run where Hive is installed\"\n\texit 1\nfi\n\n# Tables in the TPC-H schema.\nTABLES=\"part partsupp supplier customer orders lineitem nation region\"\n\n# Get the parameters.\nSCALE=$1\nDIR=$2\nBUCKETS=13\nif [ \"X$DEBUG_SCRIPT\" != \"X\" ]; then\n\tset -x\nfi\n\n# Sanity checking.\nif [ X\"$SCALE\" = \"X\" ]; then\n\tusage\nfi\nif [ X\"$DIR\" = \"X\" ]; then\n\tDIR=/tmp/tpch-generate\nfi\nif [ $SCALE -eq 1 ]; then\n\techo \"Scale factor must be greater than 1\"\n\texit 1\nfi\n\n# Do the actual data load.\nhdfs dfs -mkdir -p ${DIR}\nhdfs dfs -ls ${DIR}/${SCALE}/lineitem > /dev/null\nif [ $? -ne 0 ]; then\n\techo \"Generating data at scale factor $SCALE.\"\n\t(cd tpch-gen; hadoop jar target/*.jar -d ${DIR}/${SCALE}/ -s ${SCALE})\nfi\nhdfs dfs -ls ${DIR}/${SCALE}/lineitem > /dev/null\nif [ $? -ne 0 ]; then\n\techo \"Data generation failed, exiting.\"\n\texit 1\nfi\necho \"TPC-H text data generation complete.\"\n\n# Create the text/flat tables as external tables. 
These will later be converted to ORCFile.\necho \"Loading text data into external tables.\"\nruncommand \"hive -i settings/load-flat.sql -f ddl-tpch/bin_flat/alltables.sql -d DB=tpch_text_${SCALE} -d LOCATION=${DIR}/${SCALE}\"\n\n# Create the optimized tables.\ni=1\ntotal=8\n\nif test $SCALE -le 1000; then \n\tSCHEMA_TYPE=flat\nelse\n\tSCHEMA_TYPE=partitioned\nfi\n\nDATABASE=tpch_${SCHEMA_TYPE}_orc_${SCALE}\nMAX_REDUCERS=2600 # ~7 years of data\nREDUCERS=$((test ${SCALE} -gt ${MAX_REDUCERS} && echo ${MAX_REDUCERS}) || echo ${SCALE})\n\nfor t in ${TABLES}\ndo\n\techo \"Optimizing table $t ($i/$total).\"\n\tCOMMAND=\"hive -i settings/load-${SCHEMA_TYPE}.sql -f ddl-tpch/bin_${SCHEMA_TYPE}/${t}.sql \\\n\t    -d DB=${DATABASE} \\\n\t    -d SOURCE=tpch_text_${SCALE} -d BUCKETS=${BUCKETS} \\\n            -d SCALE=${SCALE} -d REDUCERS=${REDUCERS} \\\n\t    -d FILE=orc\"\n\truncommand \"$COMMAND\"\n\tif [ $? -ne 0 ]; then\n\t\techo \"Command failed, try 'export DEBUG_SCRIPT=ON' and re-running\"\n\t\texit 1\n\tfi\n\ti=`expr $i + 1`\ndone\n\nhive -i settings/load-${SCHEMA_TYPE}.sql -f ddl-tpch/bin_${SCHEMA_TYPE}/analyze.sql --database ${DATABASE}; \n\necho \"Data loaded into database ${DATABASE}.\"\n"
  }
]