I don't care whether the emojis are stripped from the comments or converted, as long as it does not trigger an error for the end user. The simpler the solution, the better. I found a few modules, for example the Emoji Scrub module, which isn't covered by the security advisory policy.
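For the "just strip them" route, here is a minimal sketch in Python rather than in any particular module's code. It assumes the failing characters are the ones above U+FFFF, which need four bytes in UTF-8; the function name strip_supplementary is made up for illustration.

```python
def strip_supplementary(text: str) -> str:
    """Drop every code point above U+FFFF (these need 4 bytes in UTF-8)."""
    return "".join(ch for ch in text if ord(ch) <= 0xFFFF)


comment = "Nice post \U0001F642 thanks"
cleaned = strip_supplementary(comment)

# The smiley is gone; everything else is untouched.
print(cleaned)
```

Characters that fit in three bytes, such as U+263A, are kept, since they are not the ones that trigger the error.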
INFO org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - Launching audit Page on //accessibility.blog.gov.uk/2016/09/02/dos-and-donts-on-designing-for-accessibility/
INFO org.asqatasun.service.command.AbstractScenarioAuditCommandImpl - Loading content for //accessibility.blog.gov.uk/2016/09/02/dos-and-donts-on-designing-for-accessibility/
ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - Incorrect string value: '\xF0\x9F\x99\x82" ...' for column 'Source' at row 1
WARN com.sebuilder.interpreter.SeInterpreter - org.hibernate.exception.GenericJDBCException: could not execute statement
WARN com.sebuilder.interpreter.SeInterpreter - org.hibernate.exception.GenericJDBCException: could not execute statement
WARN org.asqatasun.scenarioloader.ScenarioLoaderImpl - {"url":"//accessibility.blog.gov.uk/2016/09/02/dos-and-donts-on-designing-for-accessibility/","negated":false} failed.
INFO org.asqatasun.service.command.AbstractScenarioAuditCommandImpl - //accessibility.blog.gov.uk/2016/09/02/dos-and-donts-on-designing-for-accessibility/ has been loaded
INFO org.asqatasun.service.command.AuditCommandImpl - Adapting //accessibility.blog.gov.uk/2016/09/02/dos-and-donts-on-designing-for-accessibility/
WARN org.asqatasun.service.AuditServiceImpl - Audit has no corrected DOM
INFO org.asqatasun.service.command.AuditCommandImpl - //accessibility.blog.gov.uk/2016/09/02/dos-and-donts-on-designing-for-accessibility/ has been adapted
WARN org.asqatasun.service.command.AuditCommandImpl - Audit status isERROR whilePROCESSING was required
WARN org.asqatasun.service.command.AuditCommandImpl - Audit status isERROR whileCONSOLIDATION was required
WARN org.asqatasun.service.command.AuditCommandImpl - Audit status isERROR whileANALYSIS was required
INFO org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - Audit page terminated on //accessibility.blog.gov.uk/2016/09/02/dos-and-donts-on-designing-for-accessibility/
The final two characters are both 'MULTIPLE MUSICAL NOTES' (U+1F3B6), for which the UTF-8 encoding is 0xf09f8eb6.
The tweet_text field in my table is encoded in utf8mb4. But when I try to store the tweet in that column I get the following error message:
Incorrect string value: '\xF0\x9F\x8E\xB6\xF0\x9F...' for column 'tweet_text' at row 1.
What is going wrong? How can I fix this? I need to store multiple languages as well and this character set works for all languages but not for the special characters like emoticons and emojis.
This is my create table statement:
CREATE TABLE `twitter_status_data` (
  `unique_status_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `metadata_result_type` text CHARACTER SET utf8,
  `created_at` text CHARACTER SET utf8 NOT NULL COMMENT 'UTC time when this Tweet was created.',
  `id` bigint(20) unsigned NOT NULL COMMENT 'Unique tweet identifier',
  `id_str` text CHARACTER SET utf8 NOT NULL,
  `tweet_text` text COMMENT 'Actual UTF-8 text',
  `user_id_str` text CHARACTER SET utf8,
  `user_name` text COMMENT 'User''s name',
  `user_screen_name` text COMMENT 'Twitter handle',
  `coordinates` text CHARACTER SET utf8,
  PRIMARY KEY (`unique_status_id`),
  KEY `user_id_index` (`user_id`),
  FULLTEXT KEY `tweet_text_index` (`tweet_text`)
) ENGINE=InnoDB AUTO_INCREMENT=82451 DEFAULT CHARSET=utf8mb4;
asked Dec 5, 2013 at 21:46
db1
I was finally able to figure out the issue. I had to change some settings in the MySQL configuration file, my.ini. This article helped a lot.
First I changed the character set in my.ini to utf8mb4. Next I ran the following commands in the MySQL client:
SET NAMES utf8mb4;
ALTER DATABASE dreams_twitter CHARACTER SET = utf8mb4 COLLATE = utf8mb4_general_ci;
Use the following command to check that the changes were made:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
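If you want to check that output from a script, something like the sketch below could work. It assumes you already have the result as (name, value) rows from whatever driver you use; the function name and the sample rows are made up. As far as I know, character_set_system is always utf8 and character_set_filesystem is normally binary, so those two are skipped.

```python
# Variables that are not expected to say utf8mb4 even on a correct setup.
IGNORED = {"character_set_system", "character_set_filesystem"}


def not_utf8mb4(rows):
    """Return the (name, value) pairs that still point at another charset."""
    problems = []
    for name, value in rows:
        if name in IGNORED:
            continue
        if name.startswith("character_set_") and value != "utf8mb4":
            problems.append((name, value))
        elif name.startswith("collation") and not value.startswith("utf8mb4_"):
            problems.append((name, value))
    return problems


# Hypothetical rows, shaped like the output of the SHOW VARIABLES query.
sample_rows = [
    ("character_set_client", "utf8"),
    ("character_set_connection", "utf8mb4"),
    ("character_set_system", "utf8"),
    ("collation_connection", "utf8_general_ci"),
    ("collation_server", "utf8mb4_general_ci"),
]

for name, value in not_utf8mb4(sample_rows):
    print(f"{name} is still {value}")
```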
answered Dec 6, 2013 at 17:00
db1
I hit the same problem and learnt the following:
Even though the database has a default character set of utf-8, it's possible for database columns to have a different character set in MySQL. I modified the DB and the problematic column to UTF-8:
mysql> ALTER DATABASE MyDB CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci';
mysql> ALTER TABLE database.table MODIFY COLUMN column_name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;
Now creating new tables with:
twitter_id_str VARCHAR(255) NOT NULL UNIQUE,
twitter_screen_name VARCHAR(512) CHARACTER SET utf8 COLLATE utf8_unicode_ci,
.....
) CHARACTER SET utf8 COLLATE utf8_unicode_ci;
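To spot columns that carry their own character set, one rough option is to scan the text of a CREATE TABLE statement. The sketch below assumes one column per line with backtick-quoted names, as in SHOW CREATE TABLE output; the function name is made up, and the sample DDL is a shortened copy of the table from the question.

```python
import re

# A column line starts with a backtick-quoted name; KEY lines do not.
COLUMN_CHARSET = re.compile(r"\s*`(\w+)`\s+.*?\bCHARACTER SET\s+(\w+)", re.IGNORECASE)


def column_charsets(ddl: str) -> dict:
    """Map column name -> explicitly declared character set."""
    found = {}
    for line in ddl.splitlines():
        match = COLUMN_CHARSET.match(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found


sample_ddl = """CREATE TABLE `twitter_status_data` (
  `metadata_result_type` text CHARACTER SET utf8,
  `created_at` text CHARACTER SET utf8 NOT NULL COMMENT 'UTC time',
  `tweet_text` text COMMENT 'Actual UTF-8 text',
  FULLTEXT KEY `tweet_text_index` (`tweet_text`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;"""

print(column_charsets(sample_ddl))
```

Columns without an explicit CHARACTER SET do not show up, because they inherit the table default.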
answered Dec 6, 2013 at 19:27
Vishal
It may be obvious, but it still was surprising to me, that SET NAMES utf8 is not compatible with utf8mb4 encoding. So for some apps changing table/column encoding was not enough. I had to change encoding in app configuration.
Redmine (ruby, ROR)
In config/database.yml:
production:
  adapter: mysql2
  database: redmine
  host: localhost
  username: redmine
  password: passowrd
  encoding: utf8mb4
Custom Yii application (PHP)
In config/db.php:
return [
    'class' => yii\db\Connection::class,
    'dsn' => 'mysql:host=localhost;dbname=yii',
    'username' => 'yii',
    'password' => 'password',
    'charset' => 'utf8mb4',
];
If you have utf8mb4 as a column/table encoding and are still getting errors like this, make sure that you have configured the correct charset for the DB connection in your application.
answered Jul 12, 2018 at 16:14
rob006
Change the database charset and collation:
ALTER DATABASE
  database_name
  CHARACTER SET = utf8mb4
  COLLATE = utf8mb4_unicode_ci;
Change a specific table's charset and collation:
ALTER TABLE
  table_name
  CONVERT TO CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;
Change the connection charset in the MySQL driver:
Before:
charset=utf8&parseTime=True&loc=Local
After:
charset=utf8mb4&parseTime=True&loc=Local
From this article //hackernoon.com/today-i-learned-storing-emoji-to-mysql-with-golang-204a093454b7
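If the connection parameters live in a config string, the charset swap can be done programmatically. This is only a sketch using Python's standard library, not part of any MySQL driver; the function name use_utf8mb4 is made up.

```python
from urllib.parse import parse_qsl, urlencode


def use_utf8mb4(params: str) -> str:
    """Set charset=utf8mb4 in a query-string style parameter list."""
    pairs = parse_qsl(params, keep_blank_values=True)
    updated = [(key, "utf8mb4" if key == "charset" else value) for key, value in pairs]
    # Add the parameter when it was not present at all.
    if not any(key == "charset" for key, _ in updated):
        updated.append(("charset", "utf8mb4"))
    return urlencode(updated)


print(use_utf8mb4("charset=utf8&parseTime=True&loc=Local"))
```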
answered Dec 11, 2019 at 8:17
Giang
According to the create table statement, the default charset of the table is already utf8mb4. It seems that you have a wrong connection charset.
In Java, set the datasource url like this:
jdbc:mysql://127.0.0.1:3306/testdb?useUnicode=true&characterEncoding=utf-8
?useUnicode=true&characterEncoding=utf-8 is necessary for using utf8mb4.
It works for my application.
Nakilon
answered Dec 13, 2018 at 8:50
FOR SQLALCHEMY AND PYTHON
The encoding used for Unicode has traditionally been 'utf8'. However, for MySQL versions 5.5.3 on forward, a new MySQL-specific encoding 'utf8mb4' has been introduced, and as of MySQL 8.0 a warning is emitted by the server if plain utf8 is specified within any server-side directives, replaced with utf8mb3. The rationale for this new encoding is due to the fact that MySQL’s legacy utf-8 encoding only supports codepoints up to three bytes instead of four. Therefore, when communicating with a MySQL database that includes codepoints more than three bytes in size, this new charset is preferred, if supported by both the database as well as the client DBAPI, as in:
e = create_engine("mysql+pymysql://scott:tiger@localhost/test?charset=utf8mb4")
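To make the three-byte limit concrete, here is a small sketch that only uses Python's built-in str.encode. The helper name utf8_width is made up; the byte string is the one quoted in the error message from the question.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes a single character takes in UTF-8."""
    return len(ch.encode("utf-8"))


samples = {
    "e with acute": "\u00e9",
    "euro sign": "\u20ac",
    "multiple musical notes": "\U0001F3B6",
}

for label, ch in samples.items():
    print(f"U+{ord(ch):04X} {label}: {utf8_width(ch)} byte(s), {ch.encode('utf-8').hex()}")

# The bytes from the error message decode to the musical notes character.
error_bytes = b"\xF0\x9F\x8E\xB6"
decoded = error_bytes.decode("utf-8")
print(f"U+{ord(decoded):04X}")
```

Only the last sample needs four bytes, which is why it fits in utf8mb4 but not in the legacy three-byte charset.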
answered Jun 2, 2019 at 14:34
I had used an emoji in my string, and that was the reason for this error.
So make sure you are not using a string that is not valid to save into the database.
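One way to check a string before saving is to list the characters that need four bytes. A minimal sketch, assuming the column uses the three-byte utf8 charset; the function name is made up.

```python
def four_byte_chars(text: str):
    """Return (index, code point) for every character above U+FFFF."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ord(ch) > 0xFFFF]


tweet = "ok \U0001F3B6\U0001F3B6"
for index, code_point in four_byte_chars(tweet):
    print(f"character {index} is {code_point}")
```

An empty list means every character fits in three bytes or fewer.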
answered Apr 7, 2021 at 12:03
MD SHAYON
As others said, it's because you are trying to save 4 bytes of data per character into a character set that allows at most 3 bytes per character.
If you are facing a similar issue in Java and don't have the flexibility to change the charset and collation of the database, then this answer is for you.
You can use the Emoji Java library to work around it. Convert emojis into aliases before saving/updating to the database, and convert them back to Unicode after saving, updating, or loading from the database. The main benefit is that the text stays readable after the encoding, because the library only aliases the emojis rather than the whole string.
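The alias idea can be sketched without that library. The Python below is a stand-in, not the Emoji Java API: it swaps each character above U+FFFF for its Unicode name between colons and restores it afterwards. The function names and the :NAME: format are assumptions made for illustration.

```python
import re
import unicodedata

ALIAS = re.compile(r":([A-Z0-9][A-Z0-9 \-]*):")


def to_alias(text: str) -> str:
    """Replace named characters above U+FFFF with :UNICODE NAME:."""
    out = []
    for ch in text:
        name = unicodedata.name(ch, "") if ord(ch) > 0xFFFF else ""
        out.append(f":{name}:" if name else ch)
    return "".join(out)


def from_alias(text: str) -> str:
    """Turn :UNICODE NAME: back into the character it names."""
    def restore(match):
        try:
            ch = unicodedata.lookup(match.group(1))
        except KeyError:
            return match.group(0)  # not a Unicode name, leave as typed
        # Only undo what to_alias could have produced.
        if len(ch) == 1 and ord(ch) > 0xFFFF:
            return ch
        return match.group(0)
    return ALIAS.sub(restore, text)


original = "la la \U0001F3B6\U0001F3B6"
stored = to_alias(original)
print(stored)
print(from_alias(stored) == original)
```

Characters above U+FFFF that have no Unicode name are left as they are, so they would still need separate handling.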
answered Aug 12, 2022 at 6:24
Shivang Agarwal
I changed the MySQL settings and still got the same error. Finally, I used the function utf8_decode() on the string before the insert.
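As far as I understand it, PHP's utf8_decode() converts UTF-8 to ISO-8859-1 and puts a question mark in place of anything that charset cannot represent. The Python sketch below roughly mimics that to show what ends up stored; the function name is made up and this is not the PHP implementation.

```python
def like_utf8_decode(text: str) -> str:
    """Keep only what ISO-8859-1 can represent; everything else becomes '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


print(like_utf8_decode("caf\u00e9 \U0001F3B6"))
print(like_utf8_decode("\u65e5\u672c"))
```

The insert no longer fails because the emoji is gone, but any text outside ISO-8859-1 is lost as well, which matters if you need to store multiple languages.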