Born Without Flavor (生而无味) 无为
Founder
2024-04-13 20:52:06

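I didn't expect a whole year to have passed since my last post. This year has been surreal, surreal, and still goddamn surreal.
COVID tests everywhere, layoffs everywhere, the real-estate economy collapsing, companies and training schools going under one after another. As a sucker for paid knowledge courses, I was cheated out of more than 80,000 yuan, nearly everything I had saved during this year in Chengdu.
Sometimes I can't help asking myself what the point of all this is, why I came to Chengdu, and why I do this job.
This is a dirty world, and I am not a dirty player.
Every day is meaningless repetition: overtime, overtime, overtime. In the end my salary is a little over 6,000, and only with 20 days of overtime does it barely reach 8,000. In these two years in Chengdu I have lost my freedom and my happiness. Fortunately, housing, the heaviest shackle of all, broke before I got trapped in it. Still, the wider climate has worn me out in body and mind. What has happened to this society? Is living a kind of atonement, and is death the only release?
I am beyond saving. From the first wrong step, every step after it was bound to be wrong. Countless times I have wanted to leave, yet I have never had the courage.
I suppose I am done for.
Every day I drift through life on the surface, in a daze. How can someone who doesn't even love himself love this world?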

{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "922c6efd",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2021-06-03T15:45:40.370975Z",
"iopub.status.busy": "2021-06-03T15:45:40.364045Z",
"iopub.status.idle": "2021-06-03T20:11:42.698920Z",
"shell.execute_reply": "2021-06-03T20:11:42.699417Z"
},
"papermill": {
"duration": 15962.365668,
"end_time": "2021-06-03T20:11:42.699616",
"exception": false,
"start_time": "2021-06-03T15:45:40.333948",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n"
],
"text/plain": [

]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/ipykernel_launcher.py:130: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/ipykernel_launcher.py:131: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/ipykernel_launcher.py:134: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/ipykernel_launcher.py:135: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/lightgbm/basic.py:1551: UserWarning: Using categorical_feature in Dataset.\n",
"  warnings.warn('Using categorical_feature in Dataset.')\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"第1个子模型acc:0.89048\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/lightgbm/basic.py:1551: UserWarning: Using categorical_feature in Dataset.\n",
"  warnings.warn('Using categorical_feature in Dataset.')\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"第2个子模型acc:0.88896\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/lightgbm/basic.py:1551: UserWarning: Using categorical_feature in Dataset.\n",
"  warnings.warn('Using categorical_feature in Dataset.')\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"第3个子模型acc:0.88898\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/lightgbm/basic.py:1551: UserWarning: Using categorical_feature in Dataset.\n",
"  warnings.warn('Using categorical_feature in Dataset.')\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"第4个子模型acc:0.88968\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/lightgbm/basic.py:1551: UserWarning: Using categorical_feature in Dataset.\n",
"  warnings.warn('Using categorical_feature in Dataset.')\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"第5个子模型acc:0.8874\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/lightgbm/basic.py:1551: UserWarning: Using categorical_feature in Dataset.\n",
"  warnings.warn('Using categorical_feature in Dataset.')\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"第6个子模型acc:0.88978\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/lightgbm/basic.py:1551: UserWarning: Using categorical_feature in Dataset.\n",
"  warnings.warn('Using categorical_feature in Dataset.')\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"第7个子模型acc:0.89156\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/lightgbm/basic.py:1551: UserWarning: Using categorical_feature in Dataset.\n",
"  warnings.warn('Using categorical_feature in Dataset.')\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"第8个子模型acc:0.88802\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/lightgbm/basic.py:1551: UserWarning: Using categorical_feature in Dataset.\n",
"  warnings.warn('Using categorical_feature in Dataset.')\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"第9个子模型acc:0.8881\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/lightgbm/basic.py:1551: UserWarning: Using categorical_feature in Dataset.\n",
"  warnings.warn('Using categorical_feature in Dataset.')\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"第10个子模型acc:0.89038\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.8893340000000001\n"
]
}
],
"source": [
"import pandas as pd\n",
"import lightgbm as lgb\n",
"from datetime import datetime  # high-level wrapper interface\n",
"from sklearn.preprocessing import LabelEncoder\n",
"from sklearn.model_selection import KFold,StratifiedKFold\n",
"from sklearn.metrics import accuracy_score\n",
"le=LabelEncoder()\n",
"train=pd.read_csv('data/data71852/train.csv')\n",
"test1=pd.read_csv('data/data71852/test1.csv')\n",
"\n",
"features=train.drop(['Unnamed: 0','label'],axis=1)\n",
"labels=train['label']\n",
"\n",
"# Features excluded from modeling: ['os','lan','sid']\n",
"# os: the default version in the dataset\n",
"# sid: a unique value per row\n",
"# lan: a default value\n",
"remove_list=['os','lan','sid']\n",
"col=features.columns.tolist()\n",
"for i in remove_list:\n",
"    col.remove(i)\n",
"\n",
"\n",
"\n",
"\n",
"# Feature transformation: set abnormally large values to 0\n",
"features['fea_hash']=features['fea_hash'].map(lambda x:0 if len(str(x))>16 else int(x))\n",
"features['fea1_hash']=features['fea1_hash'].map(lambda x:0 if len(str(x))>16 else int(x))\n",
"# version: set non-numeric values to 0\n",
"features['version']=features['version'].map(lambda x:int(x) if str(x).isdigit() else 0)\n",
"\n",
"# Feature selection\n",
"features=features[col]\n",
"\n",
"# Data exploration: find the feature values that drive label 1\n",
"def find_key_feature(train,selected):\n",
"    temp0=train[train['label']==0]\n",
"    temp=pd.DataFrame(columns=[0,1])\n",
"    # Share (in percent) of each value among label-0 rows\n",
"    temp[0]=temp0[selected].value_counts()/len(temp0)*100\n",
"    temp1=train[train['label']==1]\n",
"    # Share (in percent) of each value among label-1 rows\n",
"    temp[1]=temp1[selected].value_counts()/len(temp1)*100\n",
"    temp[2]=temp[1]/temp[0]\n",
"    # Keep values whose label-1 share is more than 10 times their label-0 share\n",
"    result=temp[temp[2]>10].sort_values(2,ascending=False).index\n",
"    return result\n",
"key_feature={}\n",
"\n",
"selected_cols=['osv', 'apptype', 'carrier', 'dev_height', 'dev_ppi','dev_width', 'media_id', 'ntt', 'package','version', 'fea_hash', 'location', 'fea1_hash','cus_type']\n",
"for selected in selected_cols:\n",
"    key_feature[selected]=find_key_feature(train,selected)\n",
"\n",
"# Build new features; new field name = original field name + '1'\n",
"def f(x,selected):\n",
"    # Return 1 if the value is among the key feature values, otherwise 0\n",
"    if x in key_feature[selected]:\n",
"        return 1\n",
"    else:\n",
"        return 0\n",
"\n",
"for selected in selected_cols:\n",
"    if len(key_feature[selected])>0:\n",
"        features[selected+'1']=features[selected].apply(f,args=(selected, ))\n",
"        test1[selected+'1']=test1[selected].apply(f,args=(selected, ))\n",
"\n",
"# Define the categorical features\n",
"cate_features=['apptype','carrier','ntt','version','location','cus_type']\n",
"\n",
"\n",
"# Add timestamp-derived features\n",
"def get_date(features):\n",
"    features['timestamp']=features['timestamp'].apply(lambda x:datetime.fromtimestamp(x/1000))\n",
"    temp=pd.DatetimeIndex(features['timestamp'])\n",
"    features['year']=temp.year\n",
"    features['month']=temp.month\n",
"    features['day']=temp.day\n",
"    features['week_day']=temp.weekday\n",
"    features['hour']=temp.hour\n",
"    features['minute']=temp.minute\n",
"\n",
"    # Add time_diff: hours elapsed since the earliest timestamp\n",
"    start_time=features['timestamp'].min()\n",
"    features['time_diff']=features['timestamp']-start_time\n",
"    features['time_diff']=features['time_diff'].dt.days*24+features['time_diff'].dt.seconds/3600\n",
"    # Keep day, hour and time_diff; drop the other time columns\n",
"    features.drop(['timestamp','year','month','week_day','minute'],axis=1,inplace=True)\n",
"    return features\n",
"\n",
"# Extract multi-scale time features from the training set\n",
"features=get_date(features)\n",
"# Extract multi-scale time features from the test set\n",
"test1=get_date(test1)\n",
"\n",
"\n",
"# Merge the training and test sets, then fit a single LabelEncoder on both\n",
"all_df=pd.concat([train,test1])\n",
"all_df['osv']=all_df['osv'].astype('str')\n",
"all_df['osv']=le.fit_transform(all_df['osv'])\n",
"features['osv']=all_df[all_df['label'].notnull()]['osv']\n",
"\n",
"\n",
"# Ensemble model built with cross-validation\n",
"def ensemble_model(clf,train_x,train_y,test,cate_features):\n",
"    num=10\n",
"    sk=StratifiedKFold(n_splits=num,shuffle=True,random_state=2021)\n",
"    prob=[]  # per-fold test-set probabilities\n",
"    mean_acc=0  # running mean validation accuracy\n",
"    for k,(train_index,val_index) in enumerate(sk.split(train_x,train_y)):\n",
"        train_x_real=train_x.iloc[train_index]\n",
"        train_y_real=train_y.iloc[train_index]\n",
"        val_x=train_x.iloc[val_index]\n",
"        val_y=train_y.iloc[val_index]\n",
"        # Train the sub-model\n",
"        clf=clf.fit(train_x_real,train_y_real,categorical_feature=cate_features)\n",
"        val_y_pred=clf.predict(val_x)\n",
"        acc_val=accuracy_score(val_y,val_y_pred)\n",
"        # Evaluate the sub-model\n",
"        print('第{}个子模型acc:{}'.format(k+1,acc_val))\n",
"        mean_acc+=acc_val/num\n",
"        # Sub-model prediction for class 1\n",
"        test_y_pred=clf.predict_proba(test)[:,-1]  # soft voting: keep the probability\n",
"        prob.append(test_y_pred)\n",
"    print(mean_acc)\n",
"    mean_prob=sum(prob)/num\n",
"    return mean_prob\n",
"\n",
"# Test-set prediction: keep the columns consistent with features\n",
"test_features=test1[features.columns].copy()  # explicit copy avoids SettingWithCopyWarning on the assignments below\n",
"\n",
"# Feature transformation: set abnormally large values to 0\n",
"test_features['fea_hash']=test_features['fea_hash'].map(lambda x:0 if len(str(x))>16 else int(x))\n",
"test_features['fea1_hash']=test_features['fea1_hash'].map(lambda x:0 if len(str(x))>16 else int(x))\n",
"# Data-cleaning note: mapping V3=>3, V1=>1, V6=>6, V2=>2 is not applied by the line below\n",
"# version: set non-numeric values to 0\n",
"test_features['version']=test_features['version'].map(lambda x:int(x) if str(x).isdigit() else 0)\n",
"test_features['osv']=all_df[all_df['label'].isnull()]['osv']\n",
"\n",
"\n",
"# Train with LightGBM\n",
"clf=lgb.LGBMClassifier(\n",
"    num_leaves=2**7-1,\n",
"    reg_alpha=0.5,\n",
"    reg_lambda=0.5,\n",
"    objective='binary',\n",
"    max_depth=-1,\n",
"    learning_rate=0.005,\n",
"    min_child_samples=3,\n",
"    random_state=2021,\n",
"    n_estimators=10000,\n",
"    subsample=0.5,\n",
"    colsample_bytree=0.5,\n",
")\n",
"result=ensemble_model(clf,features,labels,test_features,cate_features)\n",
"\n",
"# Save the results\n",
"a=pd.DataFrame(test1['sid'])\n",
"a['label']=result\n",
"# Convert to binary labels using a 0.9 probability threshold\n",
"a['label']=a['label'].apply(lambda x:0 if x<0.9 else 1)\n",
"a.to_csv('baseline.csv',index=False)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "PaddlePaddle 2.0.0b0 (Python 3.5)",
"language": "python",
"name": "py35-paddle1.2.0"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
},
"papermill": {
"default_parameters": {},
"duration": 15963.958064,
"end_time": "2021-06-03T20:11:43.435185",
"environment_variables": {},
"exception": null,
"input_path": "/home/aistudio/2011322.ipynb",
"output_path": "/home/aistudio/.2011322.ipynb",
"parameters": {},
"start_time": "2021-06-03T15:45:39.477121",
"version": "2.3.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
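
The stderr stream recorded in the notebook shows pandas raising SettingWithCopyWarning several times. Judging by the reported line numbers, these appear to come from the column assignments made on the test-set frame after it was selected out of test1. What follows is a minimal sketch of one common way to avoid the warning, assuming that reading is right: take an explicit copy before assigning. The toy data and the helper name clean_version are hypothetical and only stand in for the real competition data.

```python
import pandas as pd

# Hypothetical toy frame standing in for test1; column names borrowed from the notebook.
test1 = pd.DataFrame(
    {
        "sid": [1, 2, 3],
        "version": ["5", "V3", "8"],
        "fea_hash": [12, 34, 56],
    }
)


def clean_version(value):
    """Mirror the notebook's rule: keep purely numeric versions, map the rest to 0."""
    text = str(value)
    return int(text) if text.isdigit() else 0


# Selecting columns and then assigning into the result is what triggers the warning.
# An explicit .copy() makes test_features an independent frame, so the assignment
# below is unambiguous and test1 is left untouched.
test_features = test1[["version", "fea_hash"]].copy()
test_features["version"] = test_features["version"].map(clean_version)

cleaned_versions = test_features["version"].tolist()
original_versions = test1["version"].tolist()
print(cleaned_versions)
print(original_versions)
```

The copy costs some memory, which is usually acceptable for a frame that is about to be modified anyway; the alternative suggested by the warning text itself is to assign through .loc on the original frame.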
