<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title>Disqus - Latest Comments for shantanuo</title><link>http://disqus.com/by/shantanuo/</link><description></description><atom:link href="http://disqus.com/shantanuo/comments.rss" rel="self"></atom:link><language>en</language><lastBuildDate>Mon, 12 Aug 2024 05:42:58 -0000</lastBuildDate><item><title>Re: Deliver Amazon CloudWatch logs to Amazon OpenSearch Serverless</title><link>https://aws.amazon.com/blogs/big-data/deliver-amazon-cloudwatch-logs-to-amazon-opensearch-serverless/#comment-6525788647</link><description>&lt;p&gt;Thanks for this excellent article. But I think it would be easier if you made it available as an ingestion pipeline (like CloudTrail and VPC flow logs).&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Mon, 12 Aug 2024 05:42:58 -0000</pubDate></item><item><title>Re: Integrate your data and collaborate using data preparation in AWS Glue Studio</title><link>https://aws.amazon.com/blogs/aws/integrate-your-data-and-collaborate-using-data-preparation-in-aws-glue-studio/#comment-6503280692</link><description>&lt;p&gt;Thank you for this tool. I have used OpenRefine before. 
&lt;a href="https://github.com/OpenRefine/OpenRefine" rel="nofollow noopener" target="_blank" title="https://github.com/OpenRefine/OpenRefine"&gt;https://github.com/OpenRefine/OpenRefine&lt;/a&gt;  But Glue Studio is better for processing big data.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Tue, 16 Jul 2024 04:13:04 -0000</pubDate></item><item><title>Re: Ingest and analyze your data using Amazon OpenSearch Service with Amazon OpenSearch Ingestion</title><link>https://aws.amazon.com/blogs/big-data/ingest-and-analyze-your-data-using-amazon-opensearch-service-with-amazon-opensearch-ingestion/#comment-6494508362</link><description>&lt;p&gt;1) Is it possible to write a CloudFormation template to create the SQS queue, S3 bucket, and IAM role mentioned in the first few steps?&lt;br&gt;2) The animated GIFs play too fast, making them difficult to follow.&lt;/p&gt;&lt;p&gt;edit: I successfully managed to ingest the data after considerable trial and error. Thank you for the excellent article.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Thu, 04 Jul 2024 04:49:19 -0000</pubDate></item><item><title>Re: Handle tables without primary keys while creating Amazon Aurora MySQL or Amazon RDS for MySQL zero-ETL integrations with Amazon Redshift</title><link>https://aws.amazon.com/blogs/database/handle-tables-without-primary-keys-while-creating-amazon-aurora-mysql-or-amazon-rds-for-mysql-zero-etl-integrations-with-amazon-redshift/#comment-6444139176</link><description>&lt;p&gt;Thanks for the zero-ETL integration feature. 
Using DMS (Database Migration Service) is expensive and complicated compared to this.&lt;br&gt;It goes without saying that every table should have a Primary Key.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Thu, 25 Apr 2024 03:49:26 -0000</pubDate></item><item><title>Re: Amazon OpenSearch Serverless now supports automated time-based data deletion </title><link>https://aws.amazon.com/blogs/big-data/amazon-opensearch-serverless-now-supports-automated-time-based-data-deletion/#comment-6375071739</link><description>&lt;p&gt;Very useful information. &lt;br&gt;But there is no example, even though the article mentions that it is possible to create a data lifecycle policy using the CLI or CloudFormation.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Thu, 25 Jan 2024 03:13:23 -0000</pubDate></item><item><title>Re: Power neural search with AI/ML connectors in Amazon OpenSearch Service</title><link>https://aws.amazon.com/blogs/big-data/power-neural-search-with-ai-ml-connectors-in-amazon-opensearch-service/#comment-6375063321</link><description>&lt;p&gt;By default, the template deploys the Hugging Face sentence-transformers model. 
Can I use text-embedding-ada-002 by OpenAI?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Thu, 25 Jan 2024 02:47:31 -0000</pubDate></item><item><title>Re: Use Amazon Athena with Spark SQL for your open-source transactional table formats</title><link>https://aws.amazon.com/blogs/big-data/use-amazon-athena-with-spark-sql-for-your-open-source-transactional-table-formats/#comment-6375043461</link><description>&lt;p&gt;Is there a CloudFormation template that takes care of the requirements mentioned in the Prerequisites section?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Thu, 25 Jan 2024 01:39:42 -0000</pubDate></item><item><title>Re: How to Receive Alerts When Your IAM Configuration Changes</title><link>https://aws.amazon.com/blogs/security/how-to-receive-alerts-when-your-iam-configuration-changes/#comment-6269772145</link><description>&lt;p&gt;Is there a CloudFormation template to deploy this easily?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Sun, 03 Sep 2023 05:18:22 -0000</pubDate></item><item><title>Re: Perform upserts in a data lake using Amazon Athena and Apache Iceberg</title><link>https://aws-blogs-prod.amazon.com/big-data/perform-upserts-in-a-data-lake-using-amazon-athena-and-apache-iceberg/#comment-6197408115</link><description>&lt;p&gt;Very nice article. Thank you for the step-by-step guide. &lt;br&gt;But I got an error mismatched input '&amp;lt;eof&amp;gt;'. Expecting: '%', ')', '*',  in the last statement.... 
MERGE INTO curated_demo.sporting_event t USING (SELECT op, ...&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Wed, 31 May 2023 04:53:23 -0000</pubDate></item><item><title>Re: Debug AWS DMS tasks using Time Travel</title><link>https://aws.amazon.com/blogs/database/debug-aws-dms-tasks-using-time-travel/#comment-5980910417</link><description>&lt;p&gt;Time Travel seems to be available for PostgreSQL to either PostgreSQL or MySQL. I would like to see SQL Server to MySQL support. Is that possible in the future?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Wed, 14 Sep 2022 03:31:25 -0000</pubDate></item><item><title>Re: Supercharging Dream11’s Data Highway with Amazon Redshift RA3 clusters</title><link>https://aws.amazon.com/blogs/big-data/supercharging-dream11s-data-highway-with-amazon-redshift-ra3-clusters/#comment-5900844790</link><description>&lt;p&gt;Nice article. But you have mentioned "the newer version of the automated AWS CloudFormation-based toolset (now on GitHub), was not available." and it seems that the links are not working.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Tue, 28 Jun 2022 04:00:24 -0000</pubDate></item><item><title>Re: Optimize performance and reduce costs for network analytics with VPC Flow Logs in Apache Parquet format</title><link>https://aws.amazon.com/blogs/big-data/optimize-performance-and-reduce-costs-for-network-analytics-with-vpc-flow-logs-in-apache-parquet-format/#comment-5681107345</link><description>&lt;p&gt;Thanks for the article. It works as expected. 
But I have a question...&lt;br&gt;&lt;a href="https://stackoverflow.com/questions/70630441/read-partitioned-data-of-vpc-flow-log" rel="nofollow noopener" target="_blank" title="https://stackoverflow.com/questions/70630441/read-partitioned-data-of-vpc-flow-log"&gt;https://stackoverflow.com/q...&lt;/a&gt;&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Sat, 08 Jan 2022 02:50:21 -0000</pubDate></item><item><title>Re: Improve Amazon Athena query performance using AWS Glue Data Catalog partition indexes</title><link>https://aws.amazon.com/blogs/big-data/improve-amazon-athena-query-performance-using-aws-glue-data-catalog-partition-indexes/#comment-5679347658</link><description>&lt;p&gt;Thanks for your reply. One more question. Once you add data for the year 2022 to S3, can I query using Athena?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Fri, 07 Jan 2022 04:17:29 -0000</pubDate></item><item><title>Re: What’s new in Amazon Redshift – 2021, a year in review</title><link>https://aws.amazon.com/blogs/big-data/whats-new-in-amazon-redshift-2021-a-year-in-review/#comment-5673460182</link><description>&lt;p&gt;Redshift announced support for Lambda UDFs in Oct 2020. It was not in the year 2021, but worth a mention!  &lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/10/amazon-redshift-announces-support-for-lambda-udfs-and-enables-tokenization/" rel="nofollow noopener" target="_blank" title="https://aws.amazon.com/about-aws/whats-new/2020/10/amazon-redshift-announces-support-for-lambda-udfs-and-enables-tokenization/"&gt;https://aws.amazon.com/abou...&lt;/a&gt;&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Tue, 04 Jan 2022 02:48:50 -0000</pubDate></item><item><title>Re: Extending Pandas | Dr. 
Bryan Patrick Wood's Website</title><link>https://bpw1621.com/archive/extending-pandas/#comment-5658058252</link><description>&lt;p&gt;Nice article. But there is a typo:&lt;br&gt;The output as mentioned in the article is wrong.&lt;/p&gt;&lt;p&gt;&lt;code&gt;0    zAR&lt;br&gt;1    zAZ&lt;/code&gt;&lt;/p&gt;&lt;p&gt;It should be Z (capital Z) and not small z&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Fri, 24 Dec 2021 23:41:24 -0000</pubDate></item><item><title>Re: Improve Amazon Athena query performance using AWS Glue Data Catalog partition indexes</title><link>https://aws.amazon.com/blogs/big-data/improve-amazon-athena-query-performance-using-aws-glue-data-catalog-partition-indexes/#comment-5644714385</link><description>&lt;p&gt;What will be the charges if I complete all the steps mentioned in this tutorial?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Tue, 14 Dec 2021 00:01:53 -0000</pubDate></item><item><title>Re: Introducing Amazon Redshift Serverless – Run Analytics At Any Scale Without Having to Manage Data Warehouse Infrastructure</title><link>https://aws.amazon.com/blogs/aws/introducing-amazon-redshift-serverless-run-analytics-at-any-scale-without-having-to-manage-infrastructure/#comment-5634085168</link><description>&lt;p&gt;It is mentioned in the article that "To control your costs, you can specify usage limits and define actions that Amazon Redshift automatically takes." 
But I cannot find that option in the console.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Sun, 05 Dec 2021 06:20:26 -0000</pubDate></item><item><title>Re: Choosing between storage mechanisms for ML inferencing with AWS Lambda</title><link>https://aws.amazon.com/blogs/compute/choosing-between-storage-mechanisms-for-ml-inferencing-with-aws-lambda/#comment-5620119307</link><description>&lt;p&gt;Awesome post. But I have a question. What if it takes more than 30 seconds to return the results? Will it time out?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Wed, 24 Nov 2021 03:29:12 -0000</pubDate></item><item><title>Re: Use pre-trained financial language models for transfer learning in Amazon SageMaker JumpStart</title><link>https://aws.amazon.com/blogs/machine-learning/use-pre-trained-financial-language-models-for-transfer-learning-in-amazon-sagemaker-jumpstart/#comment-5564794836</link><description>&lt;p&gt;Thanks for sharing this. I will certainly try it. But how much does it (the SageMaker endpoint) cost?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Sat, 09 Oct 2021 06:56:28 -0000</pubDate></item><item><title>Re: Hosting Hugging Face models on AWS Lambda for serverless inference</title><link>https://aws.amazon.com/blogs/compute/hosting-hugging-face-models-on-aws-lambda/#comment-5537903608</link><description>&lt;p&gt;Thanks for this very useful article. But can you also mention the cost?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Fri, 17 Sep 2021 02:41:12 -0000</pubDate></item><item><title>Re: Dynamic image resizing with Python and Serverless framework</title><link>https://serverless.com/blog/dynamic-image-resizing-python/#comment-5441109694</link><description>&lt;p&gt;This is very interesting. 
But you should make it more developer friendly. There is a "Suggest a Bot" link, but that is not enough. A programmer should be able to submit his docker image as a new bot.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Fri, 02 Jul 2021 03:17:55 -0000</pubDate></item><item><title>Re: Accessing external components using Amazon Redshift Lambda UDFs</title><link>https://aws.amazon.com/blogs/big-data/accessing-external-components-using-amazon-redshift-lambda-udfs/#comment-5129151788</link><description>&lt;p&gt;Awesome. Thanks for your help. If my Lambda function takes a long time, e.g. 15 to 20 minutes, how is that handled by Redshift?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Thu, 29 Oct 2020 00:59:02 -0000</pubDate></item><item><title>Re: Accessing external components using Amazon Redshift Lambda UDFs</title><link>https://aws.amazon.com/blogs/big-data/accessing-external-components-using-amazon-redshift-lambda-udfs/#comment-5129116773</link><description>&lt;p&gt;Can you suggest how to rewrite the Lambda function code if it looks like this... &lt;br&gt;&lt;a href="https://gist.github.com/shantanuo/29bf5a1466f537a9969668543054825b" rel="nofollow noopener" target="_blank" title="https://gist.github.com/shantanuo/29bf5a1466f537a9969668543054825b"&gt;https://gist.github.com/sha...&lt;/a&gt;&lt;/p&gt;&lt;p&gt;I need to count the number of input variables and return the same number of results that are returned from that API.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Wed, 28 Oct 2020 23:56:04 -0000</pubDate></item><item><title>Re: Accessing external components using Amazon Redshift Lambda UDFs</title><link>https://aws.amazon.com/blogs/big-data/accessing-external-components-using-amazon-redshift-lambda-udfs/#comment-5128083149</link><description>&lt;p&gt;Interesting. 
I tried it and have a question that I asked on Stack Overflow.&lt;/p&gt;&lt;p&gt;&lt;a href="https://stackoverflow.com/questions/64570889/redshift-user-defined-lambda-function-returns-error" rel="nofollow noopener" target="_blank" title="https://stackoverflow.com/questions/64570889/redshift-user-defined-lambda-function-returns-error"&gt;https://stackoverflow.com/q...&lt;/a&gt;&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Wed, 28 Oct 2020 09:37:49 -0000</pubDate></item><item><title>Re: Analyzing Amazon S3 server access logs using Amazon ES</title><link>https://aws.amazon.com/blogs/big-data/analyzing-amazon-s3-server-access-logs-using-amazon-es/#comment-5124054380</link><description>&lt;p&gt;Thanks for sharing this. But I am getting a DeprecationWarning.&lt;/p&gt;&lt;p&gt;&lt;i&gt;You are using the put() function from 'botocore.vendored.requests'.  This dependency was removed from Botocore and will be removed from Lambda after 2021/01/30. Install the requests package, 'import requests' directly, and use the requests.put() function instead.&lt;/i&gt;&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">shantanuo</dc:creator><pubDate>Sun, 25 Oct 2020 05:32:03 -0000</pubDate></item></channel></rss>