[
  {
    "path": ".gitattributes",
    "content": "# Auto detect text files and perform LF normalization\n* text=auto\n\n# Custom for Visual Studio\n*.cs     diff=csharp\n\n# Standard to msysgit\n*.doc\t diff=astextplain\n*.DOC\t diff=astextplain\n*.docx diff=astextplain\n*.DOCX diff=astextplain\n*.dot  diff=astextplain\n*.DOT  diff=astextplain\n*.pdf  diff=astextplain\n*.PDF\t diff=astextplain\n*.rtf\t diff=astextplain\n*.RTF\t diff=astextplain\n"
  },
  {
    "path": ".gitignore",
    "content": "IASP520 Project(1).docx\nRe_ Final Project.zip\nRe_ Final Project/GetNewsPolarity.py\nRe_ Final Project/GYN.json\nRe_ Final Project/IASP520_SVM_V2_With_Polarity.py\nRe_ Final Project/IASP520_SVM_V2_With_Polarity_Json.py\nRe_ Final Project/NQN.json\nRe_ Final Project/scoreSentiment.R\nRe_ Final Project/scoreSentimentYahooNews.R\nRe_ Final Project/Write_Polarity_To_File.py\nRe_ Final Project/__pycache__/GetNewsPolarity.cpython-35.pyc\nData Mining Stock_V1.02.pptx\nIASP520 Project.docx"
  },
  {
    "path": "GYN.json",
    "content": "{\"2015-01-27\": {\"Polarity\": 0.8888888888888888, \"Date\": \"2015-01-27\", \"Epoch\": \"1422334800\"}, \"2015-02-13\": {\"Polarity\": -0.3333333333333333, \"Date\": \"2015-02-13\", \"Epoch\": \"1423803600\"}, \"2015-03-19\": {\"Polarity\": 1.0, \"Date\": \"2015-03-19\", \"Epoch\": \"1426737600\"}, \"2015-04-21\": {\"Polarity\": 1.0, \"Date\": \"2015-04-21\", \"Epoch\": \"1429588800\"}, \"2015-05-27\": {\"Polarity\": 1.0, \"Date\": \"2015-05-27\", \"Epoch\": \"1432699200\"}, \"2015-06-18\": {\"Polarity\": 1.0, \"Date\": \"2015-06-18\", \"Epoch\": \"1434600000\"}, \"2015-07-09\": {\"Polarity\": 1.0, \"Date\": \"2015-07-09\", \"Epoch\": \"1436414400\"}, \"2015-07-17\": {\"Polarity\": 1.0, \"Date\": \"2015-07-17\", \"Epoch\": \"1437105600\"}, \"2015-07-27\": {\"Polarity\": 0.4, \"Date\": \"2015-07-27\", \"Epoch\": \"1437969600\"}, \"2015-07-31\": {\"Polarity\": 1.0, \"Date\": \"2015-07-31\", \"Epoch\": \"1438315200\"}, \"2015-09-02\": {\"Polarity\": 1.0, \"Date\": \"2015-09-02\", \"Epoch\": \"1441166400\"}, \"2015-10-20\": {\"Polarity\": 1.0, \"Date\": \"2015-10-20\", \"Epoch\": \"1445313600\"}, \"2015-11-23\": {\"Polarity\": 1.0, \"Date\": \"2015-11-23\", \"Epoch\": \"1448254800\"}, \"2015-12-16\": {\"Polarity\": 0.2, \"Date\": \"2015-12-16\", \"Epoch\": \"1450242000\"}, \"2016-01-07\": {\"Polarity\": -1.0, \"Date\": \"2016-01-07\", \"Epoch\": \"1452142800\"}, \"2016-02-01\": {\"Polarity\": 0.0, \"Date\": \"2016-02-01\", \"Epoch\": \"1454302800\"}, \"2016-02-02\": {\"Polarity\": 1.0, \"Date\": \"2016-02-02\", \"Epoch\": \"1454389200\"}, \"2016-02-03\": {\"Polarity\": -0.16666666666666666, \"Date\": \"2016-02-03\", \"Epoch\": \"1454475600\"}, \"2016-02-10\": {\"Polarity\": 0.42857142857142855, \"Date\": \"2016-02-10\", \"Epoch\": \"1455080400\"}, \"2016-02-18\": {\"Polarity\": 0.0, \"Date\": \"2016-02-18\", \"Epoch\": \"1455771600\"}, \"2016-04-27\": {\"Polarity\": 0.4, \"Date\": \"2016-04-27\", \"Epoch\": \"1461729600\"}, \"2016-05-05\": 
{\"Polarity\": 0.2222222222222222, \"Date\": \"2016-05-05\", \"Epoch\": \"1462420800\"}, \"2016-05-16\": {\"Polarity\": 1.0, \"Date\": \"2016-05-16\", \"Epoch\": \"1463371200\"}, \"2016-05-20\": {\"Polarity\": 0.6666666666666666, \"Date\": \"2016-05-20\", \"Epoch\": \"1463716800\"}, \"2016-05-25\": {\"Polarity\": 1.0, \"Date\": \"2016-05-25\", \"Epoch\": \"1464148800\"}, \"2016-05-31\": {\"Polarity\": 1.0, \"Date\": \"2016-05-31\", \"Epoch\": \"1464667200\"}, \"2016-06-07\": {\"Polarity\": 0.6, \"Date\": \"2016-06-07\", \"Epoch\": \"1465272000\"}, \"2016-06-08\": {\"Polarity\": 1.0, \"Date\": \"2016-06-08\", \"Epoch\": \"1465358400\"}, \"2016-06-09\": {\"Polarity\": 0.3333333333333333, \"Date\": \"2016-06-09\", \"Epoch\": \"1465444800\"}, \"2016-07-05\": {\"Polarity\": 1.0, \"Date\": \"2016-07-05\", \"Epoch\": \"1467691200\"}, \"2016-07-07\": {\"Polarity\": 1.0, \"Date\": \"2016-07-07\", \"Epoch\": \"1467864000\"}, \"2016-07-18\": {\"Polarity\": 0.3137254901960784, \"Date\": \"2016-07-18\", \"Epoch\": \"1468814400\"}, \"2016-07-19\": {\"Polarity\": 1.0, \"Date\": \"2016-07-19\", \"Epoch\": \"1468900800\"}, \"2016-07-22\": {\"Polarity\": 1.0, \"Date\": \"2016-07-22\", \"Epoch\": \"1469160000\"}, \"2016-07-25\": {\"Polarity\": 0.6723809523809524, \"Date\": \"2016-07-25\", \"Epoch\": \"1469419200\"}, \"2016-08-08\": {\"Polarity\": 0.0, \"Date\": \"2016-08-08\", \"Epoch\": \"1470628800\"}, \"2016-08-26\": {\"Polarity\": 1.0, \"Date\": \"2016-08-26\", \"Epoch\": \"1472184000\"}, \"2016-08-31\": {\"Polarity\": 1.0, \"Date\": \"2016-08-31\", \"Epoch\": \"1472616000\"}, \"2016-09-22\": {\"Polarity\": 0.14285714285714285, \"Date\": \"2016-09-22\", \"Epoch\": \"1474516800\"}, \"2016-09-23\": {\"Polarity\": -0.3333333333333333, \"Date\": \"2016-09-23\", \"Epoch\": \"1474603200\"}, \"2016-10-06\": {\"Polarity\": 1.0, \"Date\": \"2016-10-06\", \"Epoch\": \"1475726400\"}, \"2016-10-18\": {\"Polarity\": 0.14285714285714285, \"Date\": \"2016-10-18\", \"Epoch\": \"1476763200\"}, 
\"2016-10-19\": {\"Polarity\": 0.14285714285714285, \"Date\": \"2016-10-19\", \"Epoch\": \"1476849600\"}, \"2016-10-27\": {\"Polarity\": 0.14285714285714285, \"Date\": \"2016-10-27\", \"Epoch\": \"1477540800\"}, \"2016-11-22\": {\"Polarity\": 0.5384615384615384, \"Date\": \"2016-11-22\", \"Epoch\": \"1479790800\"}, \"2016-11-26\": {\"Polarity\": 0.5172413793103449, \"Date\": \"2016-11-26\", \"Epoch\": \"1480136400\"}, \"2016-11-29\": {\"Polarity\": -0.16013071895424838, \"Date\": \"2016-11-29\", \"Epoch\": \"1480395600\"}, \"2016-11-30\": {\"Polarity\": 0.3076923076923077, \"Date\": \"2016-11-30\", \"Epoch\": \"1480482000\"}, \"2016-12-02\": {\"Polarity\": 0.125, \"Date\": \"2016-12-02\", \"Epoch\": \"1480654800\"}, \"2016-12-05\": {\"Polarity\": 0.41435185185185186, \"Date\": \"2016-12-05\", \"Epoch\": \"1480914000\"}, \"2016-12-08\": {\"Polarity\": 0.6153846153846154, \"Date\": \"2016-12-08\", \"Epoch\": \"1481173200\"}, \"2016-12-09\": {\"Polarity\": 0.4495726495726496, \"Date\": \"2016-12-09\", \"Epoch\": \"1481259600\"}, \"2016-12-11\": {\"Polarity\": 0.0, \"Date\": \"2016-12-11\", \"Epoch\": \"1481432400\"}, \"2016-12-12\": {\"Polarity\": 0.5454406704406705, \"Date\": \"2016-12-12\", \"Epoch\": \"1481518800\"}, \"2016-12-14\": {\"Polarity\": 0.32844490453186104, \"Date\": \"2016-12-14\", \"Epoch\": \"1481691600\"}, \"2016-12-15\": {\"Polarity\": 0.26835749954655724, \"Date\": \"2016-12-15\", \"Epoch\": \"1481778000\"}, \"2016-12-17\": {\"Polarity\": 0.46338743802957916, \"Date\": \"2016-12-17\", \"Epoch\": \"1481950800\"}, \"2016-12-18\": {\"Polarity\": 0.6180405242905243, \"Date\": \"2016-12-18\", \"Epoch\": \"1482037200\"}}"
  },
  {
    "path": "IASP520_ARMA_V1.01.py",
    "content": "from yahoo_finance import Share\nfrom pandas import Series,DataFrame\nimport pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn import svm\nimport statsmodels.api as sm\nfrom statsmodels.graphics.api import qqplot\nimport statsmodels.tsa.stattools as ts\nfrom scipy import  stats\nimport pywt\n\nyaho = Share('YHOO') #choose stock, YAHOO, GOLD\nstartday='2015-11-1' #choose first day\nendday='2016-12-15' #choose end day\n#train = 15 #How many data for train, 9 is the least.\n\n#draw\nfig=plt.figure()\nax1=fig.add_subplot(711)\nax2=fig.add_subplot(712)\nax3=fig.add_subplot(713)\nax4=fig.add_subplot(714)\nax5=fig.add_subplot(715)\nax6=fig.add_subplot(716)\nax7=fig.add_subplot(717)\n\n#Data processing\nStockDate = DataFrame(yaho.get_historical(startday, endday))\nStockDate.index = StockDate.Date\nStockDate = DataFrame.sort_index(StockDate) #sort\n\ntest = DataFrame(yaho.get_historical(startday, '2016-12-1'))\ntest.index = test.Date\ntest = DataFrame.sort_index(test)\ntest = test['Close']\ntest=test.astype(float)\ntest.plot(ax=ax5)\n\n#L=len(StockDate)\n#total_predict_data=L-train\n\n'''\n#draw\nData = StockDate.drop(['Date','Symbol','Adj_Close'],axis=1) \nData=Data.astype(float)\nax=Data.plot(secondary_y=['Volume'])\nax.set_ylabel('Value')\nax.right_ax.set_ylabel('Volume')\nplt.grid(True)\nplt.show()\n#Create more data\nvalue = pd.Series(Data['Open']-Data['Close'],index=Data.index)\n#Data['DOP'] = value #Difference between Open and Close\n#Data['DHL'] = Data['High']-Data['Low'] #Difference between High and Low\nvalue[value>=0]=0 #0 means fall\nvalue[value<0]=1 #1 means rise\nprint(value)\n'''\n#ARIMA\nClose_original = StockDate['Close']\nClose_original=Close_original.astype(float)\n#Close.plot()\nclose=pywt.dwt(Close_original, 'db4') #DB4,Wavelet decomposition\nClose_db4=pd.Series(close[0])\nClose_db4=Close_db4-14\nClose_db4.index = 
pd.Index(sm.tsa.datetools.dates_from_range('2001','2145'))\n#aa=Close.diff(3)\n\n#draw\n#aa.plot(ax=ax4)\nClose_db4.plot(ax=ax2)\nClose=Close_db4.diff(4) #stationary time series\nClose=Close[4:]\n\nprint(\"Augmented Dickey-Fuller test:\",ts.adfuller(Close,4)) #Augmented Dickey-Fuller test\nClose.plot(ax=ax3)\nClose_original.plot(ax=ax1)\n\nsm.graphics.tsa.plot_acf(Close,lags=40,ax=ax6) #ARIMA,q\nsm.graphics.tsa.plot_pacf(Close,lags=40,ax=ax7) #ARIMA,p\n#print(Close)\n\nArma = sm.tsa.ARMA(Close,order=(9,3)).fit(disp=-1, method='mle')\nprint(Arma.aic,Arma.bic,Arma.hqic)\n\n#predict\nArma_stock=Arma.predict()\nArma_stock.plot(ax=ax3)\npredict_stock = Arma.predict('2137','2148',dynamic=True)\npredict_stock.plot(ax=ax3)\n\n#reduce diff()\nL=len(Arma_stock)\ni=0\nwhile i<L:\n\tif(i<4):\n\t\tArma_stock[i]=Arma_stock[i]+Close_db4[i]\n\telse:\n\t\tArma_stock[i] = Arma_stock[i]+Arma_stock[i-4]\n\ti=i+1\nArma_stock.plot(ax=ax4)\nL=len(predict_stock)\ni=0\nwhile i<L:\n\tif(i<4):\n\t\tpredict_stock[i] = predict_stock[i]+Arma_stock[-4+i]\n\telse:\n\t\tpredict_stock[i] = predict_stock[i]+predict_stock[i-4]\n\ti=i+1\t\npredict_stock.plot(ax=ax4)\n\nplt.grid(True)\nplt.show()\n\n'''\n#Data['Value']=value\n#SVM\ncorrect = 0\ntrain_original=train\nwhile train<L:\n\tData_train=Data[train-train_original:train]\n\tvalue_train = value[train-train_original:train]\n\tData_predict=Data[train:train+1]\n\tvalue_real = value[train:train+1]\n\t#print(Data_train)\n\t#print(value_train)\n\tprint(train)\n\t#classifier =svm.SVC(kernel='poly') #52% need optimization, some data may expand to infinite demension\n\t#classifier =svm.SVC(kernel='sigmoid') #49%\n\t#classifier =svm.SVC(kernel='precomputed') #bug\n\t#classifier =svm.SVC() #kernel='rbf'\n\tclassifier =svm.LinearSVC() #53%\n\tclassifier.fit(Data_train,value_train)\n\tprint(train)\n\tvalue_predict=classifier.predict(Data_predict)\n\tprint(train)\n\t#print(\"value_real = \",value_real[0])\n\t#print(\"value_predict = 
\",value_predict)\n\tif(value_real[0]==int(value_predict)):\n\t\tcorrect=correct+1\n\t\tprint(correct)\n\ttrain = train+1\ncorrect=correct/total_predict_data*100\nprint(\"Correct =\",correct,\"%\")\n\n#test SVM\nprint(\"support_:\",classifier.support_)\nprint(\"support_vectors_:\",classifier.support_vectors_)\nprint(\"n_support_:\",classifier.n_support_)\n'''\n\n"
  },
  {
    "path": "IASP520_SVM.py",
    "content": "from yahoo_finance import Share\nfrom pandas import Series,DataFrame\nimport pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn import svm\n\nyaho = Share('YHOO') #choose stock\nstartday='2015-11-1' #choose first day\nendday='2016-11-23' #choose end day\ntrain = 15 #How many data for train, 9 is the least.\n\n#Data processing\nStockDate = DataFrame(yaho.get_historical(startday, endday))\nStockDate.index = StockDate.Date\nStockDate = DataFrame.sort_index(StockDate) #sort\n\nL=len(StockDate)\ntotal_predict_data=L-train\n\n#draw\nData = StockDate.drop(['Date','Symbol','Adj_Close'],axis=1) \nData=Data.astype(float)\nax=Data.plot(secondary_y=['Volume'])\nax.set_ylabel('Value')\nax.right_ax.set_ylabel('Volume')\n#plt.grid(True)\n#plt.show()\n\n#Create more data\nvalue = pd.Series(Data['Open'].shift(-1)-Data['Close'].shift(-1),index=Data.index)\nData['Next_Open'] = Data['Open'].shift(-1) #Next day's Open data.\n#Data['DHL'] = Data['High']-Data['Low'] #Difference between High and Low\nvalue[value>=0]=0 #0 means fall\nvalue[value<0]=1 #1 means rise\nprint(value)\n\n\n#Data['Value']=value\ncorrect = 0\ntrain_original=train\ni=0\n'''\n#Classical classification, normal way\nData_train=Data[0:L-20]\nvalue_train = value[0:L-20]\nData_predict=Data[L-20:L]\nvalue_real = value[L-20:L]\nprint(Data_predict)\nclassifier = svm.SVC()\nclassifier.fit(Data_train,value_train)\nvalue_predict=classifier.predict(Data_predict)\nprint(\"value_real = \",value_predict)\nwhile i<19:\n\tprint(\"value_real = \",value_real[i])\n\tif(value_real[i]==int(value_predict[i])):\n\t\tcorrect=correct+1\n\ti+=1\nprint(\"Correct = \",correct/19*100,\"%\")\n'''\n#loop training,15 days data for train\nwhile train<L-1:\n\tData_train=Data[train-train_original:train]\n\tvalue_train = value[train-train_original:train]\n\tData_predict=Data[train:train+1]\n\tvalue_real = value[train:train+1]\n\t#print(Data_train)\n\t#print(value_train)\n\n\tclassifier = 
svm.SVC(kernel='poly') #alternative: kernel='linear'\n\tclassifier.fit(Data_train,value_train)\n\tvalue_predict=classifier.predict(Data_predict)\n\t#print(\"value_real = \",value_real[0])\n\t#print(\"value_predict = \",value_predict)\n\tif(value_real[0]==int(value_predict)):\n\t\tcorrect=correct+1\n\ttrain = train+1\ncorrect=correct/total_predict_data*100\nprint(\"Correct = \",correct,\"%\")\n\n'''\nprint(\"support_:\",classifier.support_)\nprint(\"support_vectors_:\",classifier.support_vectors_)\nprint(\"n_support_:\",classifier.n_support_)\n'''\n"
  },
  {
    "path": "IASP520_SVM_V2.py",
    "content": "#V2.0 \n#change all the data, now it's base on relation not price. \n#Increase the training Data set.\nfrom yahoo_finance import Share\nfrom pandas import Series,DataFrame\nimport pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn import svm\n\nyaho = Share('YHOO') #choose stock\nstartday='2015-11-1' #choose first day\nendday='2016-11-23' #choose end day\ntrain = 70 #How many data for train, 9 is the least.\n\n#Data processing\nStockDate = DataFrame(yaho.get_historical(startday, endday))\nStockDate.index = StockDate.Date\nStockDate = DataFrame.sort_index(StockDate) #sort\n\nL=len(StockDate)\ntotal_predict_data=L-train\n\n#draw\nData = StockDate.drop(['Date','Symbol','Adj_Close'],axis=1) \nData=Data.astype(float)\nax=Data.plot(secondary_y=['Volume'])\nax.set_ylabel('Value')\nax.right_ax.set_ylabel('Volume')\nplt.grid(True)\nplt.show()\n\n#Create more data\nvalue = pd.Series(Data['Close'].shift(-1)-Data['Close'],index=Data.index)\n#Data['Next_Open'] = Data['Open'].shift(-1) #Next day's Open data.\nData['High-Low'] = Data['High']-Data['Low'] #Difference between High and Low\nData['NOpen-Close']=Data['Open'].shift(-1)-Data['Close'] #Next Day's Open-today's Close\nData['Close-YClose']=Data['Close']-Data['Close'].shift(1) #Today is rise or fall\nData['Close-Open']=Data['Close']-Data['Open'] #today's Close - Open\nData['High-Close'] = Data['High']-Data['Close'] #today's High - Close\nData['Close-Low'] = Data['Close']-Data['Low'] #today's Close - Low\nvalue[value>=0]=1 #0 means rise\nvalue[value<0]=0 #1 means fall\nData=Data.dropna(how='any')\ndel(Data['Open'])\ndel(Data['Close'])\ndel(Data['High'])\ndel(Data['Low'])\n#print(Data)\nprint(type(Data))\n\n\n#Data['Value']=value\ncorrect = 0\ntrain_original=train\ni=0\nL=len(Data)\n'''\n#Classical classification, normal way\nData_train=Data[0:L-20]\nvalue_train = value[0:L-20]\nData_predict=Data[L-20:L]\nvalue_real = value[L-20:L]\nprint(Data_predict)\nclassifier = 
svm.SVC()\nclassifier.fit(Data_train,value_train)\nvalue_predict=classifier.predict(Data_predict)\nprint(\"value_real = \",value_predict)\nwhile i<19:\n\tprint(\"value_real = \",value_real[i])\n\tif(value_real[i]==int(value_predict[i])):\n\t\tcorrect=correct+1\n\ti+=1\nprint(\"Correct = \",correct/19*100,\"%\")\n'''\n#rolling training: each day is predicted by a model trained on the previous 70 days\nprint(L)\nwhile train<L:\n\tData_train=Data[train-train_original:train]\n\tvalue_train = value[train-train_original:train]\n\tData_predict=Data[train:train+1]\n\tvalue_real = value[train:train+1]\n\t#print(Data_train)\n\t#print(value_train)\n\n\tclassifier = svm.SVC(kernel='poly')\n\tclassifier.fit(Data_train,value_train)\n\tvalue_predict=classifier.predict(Data_predict)\n\t#print(\"value_real = \",value_real[0])\n\t#print(\"value_predict = \",value_predict)\n\tif(value_real[0]==int(value_predict)):\n\t\tcorrect=correct+1\n\ttrain = train+1\ncorrect=correct/total_predict_data*100\nprint(\"Correct = \",correct,\"%\")\n\n'''\nprint(\"support_:\",classifier.support_)\nprint(\"support_vectors_:\",classifier.support_vectors_)\nprint(\"n_support_:\",classifier.n_support_)\n'''\n"
  },
  {
    "path": "IASP520_SVM_V2_With_Polarity_Json.py",
    "content": "#V2.0 \n#change all the data, now it's base on relation not price. \n#Increase the training Data set.\nfrom yahoo_finance import Share\nfrom pandas import Series,DataFrame\nimport pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn import svm\nimport json\n\n#Json\nwith open('GYN.json', 'r') as f:\n    RData = json.load(f)\nprint(RData.keys())\n\nyaho = Share('YHOO') #choose stock\nstartday='2015-11-1' #choose first day\nendday='2016-12-15' #choose end day\ntrain = 70 #How many data for train, 9 is the least.\n\n#Data processing\nStockDate = DataFrame(yaho.get_historical(startday, endday))\nStockDate.index = StockDate.Date\nStockDate = DataFrame.sort_index(StockDate) #sort\n\nRDate=[]\nRPolarity=[]\nfor key in RData.keys():\n\tRDate.append(RData[key]['Date'])\n\tRPolarity.append(RData[key]['Polarity'])\n#print(RDate,RPolarity)\nRDataPanda=pd.DataFrame(RPolarity,index=RDate,columns=['Polarity'])\n#print(RDataPanda)\nStockDate['Polarity']=RDataPanda['Polarity']\nStockDate.fillna(value=0,inplace=True)\n#print(StockDate['Polarity'])\n\n'''\nfor index in StockDate.index:\n\tprint(index)\n\tfor key in RData.keys():\n\t\tprint(index)\n\t\tprint(key)\n\t\tif (index==key):\n\t\t\tStockDate[index,'Polarity'] = RData[key]['Polarity']\nprint(StockDate['Polarity'])\n'''\n\nL=len(StockDate)\ntotal_predict_data=L-train\n\n#draw\nData = StockDate.drop(['Date','Symbol','Adj_Close'],axis=1) \nData=Data.astype(float)\nDataPic1=Data.drop(['Polarity'],axis=1)\nfig=plt.figure()\nax1=fig.add_subplot(111)\nAx1=DataPic1.plot(secondary_y=['Volume'],ax=ax1)\nAx1.set_ylabel('Value')\nAx1.right_ax.set_ylabel('Volume')\n\nplt.grid(True)\nplt.show()\n\n#Create more data\nvalue = pd.Series(Data['Close'].shift(-1)-Data['Close'],index=Data.index)\n#Data['Next_Open'] = Data['Open'].shift(-1) #Next day's Open data.\nData['High-Low'] = Data['High']-Data['Low'] #Difference between High and Low\nData['NOpen-Close']=Data['Open'].shift(-1)-Data['Close'] #Next 
Day's Open-today's Close\nData['Close-YClose']=Data['Close']-Data['Close'].shift(1) #Today is rise or fall\nData['Close-Open']=Data['Close']-Data['Open'] #today's Close - Open\nData['High-Close'] = Data['High']-Data['Close'] #today's High - Close\nData['Close-Low'] = Data['Close']-Data['Low'] #today's Close - Low\nvalue[value>=0]=1 #1 means rise\nvalue[value<0]=0 #0 means fall\nData=Data.dropna(how='any')\ndel(Data['Open'])\ndel(Data['Close'])\ndel(Data['High'])\ndel(Data['Low'])\n#print(Data)\nprint(type(Data))\n\n\n#Data['Value']=value\ncorrect = 0\ntrain_original=train\ni=0\nL=len(Data)\n'''\n#Classical classification, normal way\nData_train=Data[0:L-20]\nvalue_train = value[0:L-20]\nData_predict=Data[L-20:L]\nvalue_real = value[L-20:L]\nprint(Data_predict)\nclassifier = svm.SVC()\nclassifier.fit(Data_train,value_train)\nvalue_predict=classifier.predict(Data_predict)\nprint(\"value_real = \",value_predict)\nwhile i<19:\n\tprint(\"value_real = \",value_real[i])\n\tif(value_real[i]==int(value_predict[i])):\n\t\tcorrect=correct+1\n\ti+=1\nprint(\"Correct = \",correct/19*100,\"%\")\n'''\n#rolling training: each day is predicted by a model trained on the previous 70 days\nprint(L)\nwhile train<L:\n\tData_train=Data[train-train_original:train]\n\tvalue_train = value[train-train_original:train]\n\tData_predict=Data[train:train+1]\n\tvalue_real = value[train:train+1]\n\t#print(Data_train)\n\t#print(value_train)\n\n\tclassifier = svm.SVC(kernel='poly',degree=40) #poly kernel: (gamma*u'*v + coef0)^degree\n\tclassifier.fit(Data_train,value_train)\n\tvalue_predict=classifier.predict(Data_predict)\n\t#print(\"value_real = \",value_real[0])\n\t#print(\"value_predict = \",value_predict)\n\tif(value_real[0]==int(value_predict)):\n\t\tcorrect=correct+1\n\ttrain = train+1\ncorrect=correct/total_predict_data*100\nprint(\"Correct = \",correct,\"%\")\n\n'''\nprint(\"support_:\",classifier.support_)\nprint(\"support_vectors_:\",classifier.support_vectors_)\nprint(\"n_support_:\",classifier.n_support_)\n'''\n"
  },
  {
    "path": "NQN.json",
    "content": "{\"2016-11-19\": {\"Polarity\": 0.2751415251415251, \"Date\": \"2016-11-19\", \"Epoch\": \"1479531600\"}, \"2016-11-21\": {\"Polarity\": 0.06792373769117956, \"Date\": \"2016-11-21\", \"Epoch\": \"1479704400\"}, \"2016-11-22\": {\"Polarity\": 0.5, \"Date\": \"2016-11-22\", \"Epoch\": \"1479790800\"}, \"2016-11-23\": {\"Polarity\": 0.14285714285714285, \"Date\": \"2016-11-23\", \"Epoch\": \"1479877200\"}, \"2016-11-25\": {\"Polarity\": 0.12731481481481483, \"Date\": \"2016-11-25\", \"Epoch\": \"1480050000\"}, \"2016-11-29\": {\"Polarity\": -0.02, \"Date\": \"2016-11-29\", \"Epoch\": \"1480395600\"}, \"2016-11-30\": {\"Polarity\": 0.35857713996945495, \"Date\": \"2016-11-30\", \"Epoch\": \"1480482000\"}, \"2016-12-02\": {\"Polarity\": 0.047245290719756014, \"Date\": \"2016-12-02\", \"Epoch\": \"1480654800\"}, \"2016-12-05\": {\"Polarity\": 0.1, \"Date\": \"2016-12-05\", \"Epoch\": \"1480914000\"}, \"2016-12-06\": {\"Polarity\": 0.5, \"Date\": \"2016-12-06\", \"Epoch\": \"1481000400\"}, \"2016-12-07\": {\"Polarity\": 0.5238095238095238, \"Date\": \"2016-12-07\", \"Epoch\": \"1481086800\"}, \"2016-12-08\": {\"Polarity\": 0.40145502645502645, \"Date\": \"2016-12-08\", \"Epoch\": \"1481173200\"}, \"2016-12-09\": {\"Polarity\": 0.3864942528735632, \"Date\": \"2016-12-09\", \"Epoch\": \"1481259600\"}, \"2016-12-10\": {\"Polarity\": 0.3333333333333333, \"Date\": \"2016-12-10\", \"Epoch\": \"1481346000\"}, \"2016-12-12\": {\"Polarity\": 0.008644647933339241, \"Date\": \"2016-12-12\", \"Epoch\": \"1481518800\"}, \"2016-12-14\": {\"Polarity\": 0.21833429390355386, \"Date\": \"2016-12-14\", \"Epoch\": \"1481691600\"}, \"2016-12-15\": {\"Polarity\": 0.21662705486234898, \"Date\": \"2016-12-15\", \"Epoch\": \"1481778000\"}, \"2016-12-16\": {\"Polarity\": 0.13751596460018634, \"Date\": \"2016-12-16\", \"Epoch\": \"1481864400\"}, \"2016-12-18\": {\"Polarity\": 1.0, \"Date\": \"2016-12-18\", \"Epoch\": \"1482037200\"}}"
  },
  {
    "path": "README.md",
    "content": "StockPrediction\n=========\nStock data come from Yahoo_finance by Python.\n\nNews data come from tm.plugin by R.\n\nARMIA\n===\nStep\n---\n1.Use Daubechies 4 wavelet to transform the Stock Data which comes from Yahoo_finance.\n\n2.Difference the time series make it stationary.\n\n3.Create ACF & Pacf pictures to find out p & q which is the parameter in ARIMA.\n\n4.Predict the stationary time series by ARIMA(p,q). Because this ARIMA package can't do difference bigger than 2, thus I don't use ARIMA(p,d,q).\n\n5.Revert difference which we do in step 2.\n\n|               ARIMA                |\n|:----------------------------------:|\n| ![Conclusion](pic/ARIMA_EX.png)    |\n\nSVM\n===\nNot good enough. I try to transform Price to the relation, like the relation between Open & Close attributes or today & yesterday.\nStock can't be Predicted only based on history stock data, so we pull in new data. It's still not good but much better than before.\n\n|               SVM                  |\n|:----------------------------------:|\n| ![Conclusion](pic/SVM_V2.0.png)    |\n"
  }
]