Azure Hands-on Lab - Migrating Data with Azure Data Factory
Founder
2024-03-24 01:01:49

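This lab uses Azure Cosmos DB. Its key points are:

1: The cosmicworks tool is used to generate the lab data.

2: It clarifies how a Cosmos DB account name relates to a database id and a container id.

3: An ADF connection and task are created to migrate data from the products container of the cosmicworks database to the flatproducts container of the same database.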
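
The relationship in point 2 can be sketched as a small data structure: the account name determines the endpoint URI, and a container is addressed inside a database inside that account. This is a minimal illustration, not part of the original lab. The class name `ContainerAddress` is hypothetical, the account name `dp420` is borrowed from the lab's own example endpoint, and the endpoint pattern assumes the global Azure cloud.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerAddress:
    """Hypothetical helper: where a container lives in Azure Cosmos DB."""

    account_name: str  # globally unique, chosen when the account is created
    database_id: str   # unique within the account
    container_id: str  # unique within the database

    @property
    def endpoint(self) -> str:
        # The URI shown in the Keys pane follows this pattern (global Azure cloud).
        return f"https://{self.account_name}.documents.azure.com:443/"

    @property
    def resource_link(self) -> str:
        # Path of the container relative to the account endpoint.
        return f"dbs/{self.database_id}/colls/{self.container_id}"


source = ContainerAddress("dp420", "cosmicworks", "products")
target = ContainerAddress("dp420", "cosmicworks", "flatproducts")

print(source.endpoint)
print(source.resource_link)
print(target.resource_link)
```

Both containers share one account and one database, so only the last segment of the resource link differs between the source and the target.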

This lab comes from: Exercise: Migrate existing data using Azure Data Factory - Training | Microsoft Learn

Migrate existing data using Azure Data Factory

In Azure Data Factory, Azure Cosmos DB is supported both as a source for data ingestion and as a target (sink) for data output.

In this lab, we will populate Azure Cosmos DB using a helpful command-line utility and then use Azure Data Factory to move a subset of data from one container to another.

Create and seed your Azure Cosmos DB SQL API account

You will use a command-line utility that creates a cosmicworks database and a products container at 4,000 request units per second (RU/s). Once created, you will adjust the throughput down to 400 RU/s.

To accompany the products container, you will create a flatproducts container manually that will be the target of the ETL transformation and load operation at the end of this lab.
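
As a rough sketch of the seeding step described above, the snippet below assembles the `cosmicworks` command line that step 11 runs. The flags `--endpoint`, `--key`, and `--datasets` are the ones listed in this lab; the function name `build_seed_command` is hypothetical, and the endpoint and key are the lab's own example values. The command is only printed here, not executed.

```python
import shlex


def build_seed_command(endpoint: str, key: str, dataset: str = "product") -> list[str]:
    """Return the argument list for the cosmicworks seeding command."""
    return ["cosmicworks", "--endpoint", endpoint, "--key", key, "--datasets", dataset]


command = build_seed_command(
    endpoint="https://dp420.documents.azure.com:443/",
    key="fDR2ci9QgkdkvERTQ==",
)

# shlex.join quotes arguments wherever a POSIX shell would need it.
print(shlex.join(command))
```

On a machine where the tool is installed, the same list could be passed to `subprocess.run(command, check=True)` instead of being printed.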

  1. In a new web browser window or tab, navigate to the Azure portal (portal.azure.com).

  2. Sign into the portal using the Microsoft credentials associated with your subscription.

  3. Select + Create a resource, search for Cosmos DB, and then create a new Azure Cosmos DB SQL API account resource with the following settings, leaving all remaining settings to their default values:

    Setting                     Value
    Subscription                Your existing Azure subscription
    Resource group              Select an existing or create a new resource group
    Account Name                Enter a globally unique name
    Location                    Choose any available region
    Capacity mode               Provisioned throughput
    Apply Free Tier Discount    Do Not Apply
    Limit the total amount of throughput that can be provisioned on this account    Unchecked

    📝 Your lab environments may have restrictions preventing you from creating a new resource group. If that is the case, use the existing pre-created resource group.

  4. Wait for the deployment task to complete before continuing with this task.

  5. Go to the newly created Azure Cosmos DB account resource and navigate to the Keys pane.

  6. This pane contains the connection details and credentials necessary to connect to the account from the SDK. Specifically:

    1. Record the value of the URI field. You will use this endpoint value later in this exercise.

    2. Record the value of the PRIMARY KEY field. You will use this key value later in this exercise.

  7. Close your web browser window or tab.

  8. Start Visual Studio Code.

    📝 If you are not already familiar with the Visual Studio Code interface, review the Get Started guide for Visual Studio Code.

  9. In Visual Studio Code, open the Terminal menu and then select New Terminal to open a new terminal instance.

  10. Install the cosmicworks command-line tool for global use on your machine.

    dotnet tool install --global cosmicworks

    💡 This command may take a couple of minutes to complete. If you have already installed the latest version of this tool, it will output the warning message (Tool 'cosmicworks' is already installed).

  11. Run cosmicworks to seed your Azure Cosmos DB account with the following command-line options:

    Option        Value
    --endpoint    The endpoint value you copied earlier in this lab
    --key         The key value you copied earlier in this lab
    --datasets    product

    cosmicworks --endpoint <endpoint> --key <key> --datasets product

    📝 For example, if your endpoint is: https://dp420.documents.azure.com:443/ and your key is: fDR2ci9QgkdkvERTQ==, then the command would be: cosmicworks --endpoint https://dp420.documents.azure.com:443/ --key fDR2ci9QgkdkvERTQ== --datasets product

  12. Wait for the cosmicworks command to finish populating the account with a database, container, and items.

  13. Close the integrated terminal.

  14. Close Visual Studio Code.

  15. In a new web browser window or tab, navigate to the Azure portal (portal.azure.com).

  16. Sign into the portal using the Microsoft credentials associated with your subscription.

  17. Select Resource groups, then select the resource group you created or viewed earlier in this lab, and then select the Azure Cosmos DB account resource you created in this lab.

  18. Within the Azure Cosmos DB account resource, navigate to the Data Explorer pane.

  19. In the Data Explorer, expand the cosmicworks database node, expand the products container node, and then select Items.

  20. Observe and select the various JSON items in the products container. These are the items created by the command-line tool used in previous steps.

  21. Select the Scale & Settings node. In the Scale & Settings tab, select Manual, update the required throughput setting from 4000 RU/s to 400 RU/s, and then Save your changes.

  22. In the Data Explorer pane, select New Container.

  23. In the New Container popup, enter the following values for each setting, and then select OK:

    Setting                             Value
    Database id                         Use existing | cosmicworks
    Container id                        flatproducts
    Partition key                       /category
    Container throughput (autoscale)    Manual
    RU/s                                400
  24. Back in the Data Explorer pane, expand the cosmicworks database node and then observe the flatproducts container node within the hierarchy.

  25. Return to the Home of the Azure portal.
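
Steps 22 and 23 can also be scripted. The following is a minimal sketch, assuming the `azure-cosmos` Python package (v4) is installed; it is an alternative to the portal steps, not part of the original lab. The setting values mirror the New Container step above, while the function name `create_flatproducts` and the `FLATPRODUCTS` dictionary are hypothetical.

```python
# Values taken from the New Container step of this lab.
FLATPRODUCTS = {
    "database_id": "cosmicworks",
    "container_id": "flatproducts",
    "partition_key_path": "/category",
    "throughput": 400,  # manual RU/s
}


def create_flatproducts(endpoint: str, key: str):
    """Create the target container if it does not exist yet (sketch)."""
    # Imported here so the settings above can be inspected without the SDK.
    from azure.cosmos import CosmosClient, PartitionKey

    client = CosmosClient(endpoint, credential=key)
    database = client.get_database_client(FLATPRODUCTS["database_id"])
    return database.create_container_if_not_exists(
        id=FLATPRODUCTS["container_id"],
        partition_key=PartitionKey(path=FLATPRODUCTS["partition_key_path"]),
        offer_throughput=FLATPRODUCTS["throughput"],
    )
```

To use it, call `create_flatproducts` with the URI and PRIMARY KEY values recorded from the Keys pane. Nothing contacts the account until the function is called.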

Create Azure Data Factory resource

Now that the Azure Cosmos DB SQL API resources are in place, you will create an Azure Data Factory resource and configure the components and connections needed for a one-time extract, transform, and load (ETL) operation that moves data from one SQL API container to another.
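
For orientation, the copy operation that the wizard steps below configure might look roughly like the following activity definition, written here as a Python dictionary and printed as JSON. The type names `Copy`, `CosmosDbSqlApiSource`, and `CosmosDbSqlApiSink` come from the Azure Data Factory connector for Azure Cosmos DB; the activity and dataset names are hypothetical placeholders, and the definition the wizard actually generates will differ in detail.

```python
import json

copy_activity = {
    "name": "CopyProductsToFlatProducts",  # hypothetical activity name
    "type": "Copy",
    # Hypothetical dataset names for the products and flatproducts containers.
    "inputs": [{"referenceName": "SourceProducts", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "TargetFlatProducts", "type": "DatasetReference"}],
    "typeProperties": {
        "source": {
            "type": "CosmosDbSqlApiSource",
            # The flattening query entered in the Source step of the wizard.
            "query": "SELECT p.name, p.categoryName as category, p.price FROM products p",
        },
        "sink": {
            "type": "CosmosDbSqlApiSink",
            "writeBehavior": "insert",
        },
    },
}

print(json.dumps(copy_activity, indent=2))
```

The source reads through a query rather than a whole container, which is what lets the copy rename `categoryName` to `category` on the way to the target.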

  1. Select + Create a resource, search for Data Factory, and then create a new Azure Data Factory resource with the following settings, leaving all remaining settings to their default values:

    Setting              Value
    Subscription         Your existing Azure subscription
    Resource group       Select an existing or create a new resource group
    Name                 Enter a globally unique name
    Region               Choose any available region
    Version              V2
    Git configuration    Configure Git later

    📝 Your lab environments may have restrictions preventing you from creating a new resource group. If that is the case, use the existing pre-created resource group.

  2. Wait for the deployment task to complete before continuing with this task.

  3. Go to the newly created Azure Data Factory resource and select Open Azure Data Factory Studio.

    💡 Alternatively, you can navigate to (adf.azure.com/home), select your newly created Data Factory resource, and then select the home icon.

  4. From the home screen, select the Ingest option to start the quick wizard for a one-time, at-scale copy data operation, and move to the Properties step of the wizard.

  5. Starting with the Properties step of the wizard, in the Task type section, select Built-in copy task.

  6. In the Task cadence or task schedule section, select Run once now and then select Next to move to the Source step of the wizard.

  7. In the Source step of the wizard, in the Source type list, select Azure Cosmos DB (SQL API).

  8. In the Connection section, select + New connection.

  9. In the New connection (Azure Cosmos DB (SQL API)) popup, configure the new connection with the following values, and then select Create:

    Setting                            Value
    Name                               CosmosSqlConn
    Connect via integration runtime    AutoResolveIntegrationRuntime
    Authentication method              Account key | Connection string
    Account selection method           From Azure subscription
    Azure subscription                 Your existing Azure subscription
    Azure Cosmos DB account name       Your existing Azure Cosmos DB account name you chose earlier in this lab
    Database name                      cosmicworks
  10. Back in the Source data store section, within the Source tables section, select Use query.

  11. In the Table name list, select products.

  12. In the Query editor, delete the existing content and enter the following query:

    SELECT p.name, p.categoryName as category, p.price 
    FROM products p
  13. Select Preview data to test the query's validity. Select Next to move to the Target step of the wizard.

  14. In the Target step of the wizard, in the Target type list, select Azure Cosmos DB (SQL API).

  15. In the Connection list, select CosmosSqlConn.

  16. In the Target list, select flatproducts and then select Next to move to the Settings step of the wizard.

  17. In the Settings step of the wizard, in the Task name field, enter FlattenAndMoveData.

  18. Leave all remaining fields to their default blank values and then select Next to move to the final step of the wizard.

  19. Review the Summary of the steps you have selected in the wizard and then select Next.

  20. Observe the various steps in the deployment. When the deployment has finished, select Finish.

  21. Close your web browser window or tab.

  22. In a new web browser window or tab, navigate to the Azure portal (portal.azure.com).

  23. Sign into the portal using the Microsoft credentials associated with your subscription.

  24. Select Resource groups, then select the resource group you created or viewed earlier in this lab, and then select the Azure Cosmos DB account resource you created in this lab.

  25. Within the Azure Cosmos DB account resource, navigate to the Data Explorer pane.

  26. In the Data Explorer, expand the cosmicworks database node, select the flatproducts container node, and then select New SQL Query.

  27. Delete the contents of the editor area.

  28. Create a new SQL query that will return all documents where the name is equivalent to HL Headset:

    SELECT p.name, p.category, p.price
    FROM products p
    WHERE p.name = 'HL Headset'
  29. Select Execute Query.

  30. Observe the results of the query.
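
To make the expected result concrete, the sketch below mimics in plain Python what the copy query and the verification query do to a single item. The sample item is invented for illustration (its field values are not taken from the real cosmicworks dataset), and the helper names `flatten` and `find_by_name` are hypothetical.

```python
def flatten(item: dict) -> dict:
    """Mimic: SELECT p.name, p.categoryName as category, p.price FROM products p."""
    return {
        "name": item["name"],
        "category": item["categoryName"],  # renamed to match the /category partition key
        "price": item["price"],
    }


def find_by_name(items: list[dict], name: str) -> list[dict]:
    """Mimic: SELECT p.name, p.category, p.price ... WHERE p.name = <name>."""
    return [
        {"name": i["name"], "category": i["category"], "price": i["price"]}
        for i in items
        if i["name"] == name
    ]


# Invented sample item; real items in the products container carry more properties.
products = [
    {
        "id": "example-id",
        "name": "HL Headset",
        "categoryName": "Components, Headsets",
        "sku": "EXAMPLE-SKU",
        "price": 100.0,
    }
]

flatproducts = [flatten(p) for p in products]
print(find_by_name(flatproducts, "HL Headset"))
```

In the real container each copied item also carries an `id` and system properties; the verification query returns only the three projected fields, as the sketch does.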
