How should database synchronization actually be done? #6

Open
HbnKing opened this issue Jan 11, 2019 · 1 comment

HbnKing commented Jan 11, 2019

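In general, we use Sqoop to import data from a traditional relational database into Hive for analysis.
Sqoop supports incremental synchronization and incremental updates of the data, but delete operations on the source are not propagated to the Hive database.
The approach is as follows:
① For ordinary incremental syncs and incremental updates, keep using the existing update process.
② When deletes have occurred, import the source table's primary keys in a single pass into a primary-key table.
③ Inner join the primary-key table with the existing Hive table; rows whose keys no longer exist in the source drop out of the result.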
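To see why deletes get lost, here is a minimal sketch of an append-style incremental pull. It is a toy model in plain Python, not Sqoop's actual implementation; `incremental_pull` and the sample rows are hypothetical names made up for illustration. The pull only asks the source for rows above the last seen key, so a row deleted upstream is never mentioned again and stays in the target.

```python
# Toy model of an append-style incremental import.
# Assumption: rows are keyed by an increasing integer id; this is NOT Sqoop code.

def incremental_pull(source_rows, target_rows, last_value):
    """Copy rows whose id is greater than last_value from source into target.

    Returns the new high-water mark to use for the next pull.
    """
    new_rows = {rid: row for rid, row in source_rows.items() if rid > last_value}
    target_rows.update(new_rows)
    return max([last_value, *source_rows])


source = {1: "alice", 2: "bob", 3: "carol"}
target = {}

# First sync copies everything.
last_value = incremental_pull(source, target, last_value=0)

# Upstream, one row is deleted and one is added.
del source[2]
source[4] = "dave"

# Second sync only sees id 4; nothing tells the target that id 2 is gone.
last_value = incremental_pull(source, target, last_value)

stale_ids = sorted(set(target) - set(source))
print(stale_ids)
```

The `stale_ids` list holds exactly the rows that the primary-key join described in the steps that follow is meant to remove.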

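-- Keep only the rows whose primary key still exists in the source.
-- lijie_table_tmp holds the primary keys freshly imported from the source table.
insert overwrite table lijie_table
select
    a.id, a.name, a.addr
from
    lijie_table a
inner join
    lijie_table_tmp b
on
    a.id = b.id;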
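
The effect of the overwrite query can be checked locally with a small simulation. The sketch below uses Python's built-in `sqlite3` module rather than Hive, so it is only an approximation: SQLite has no `INSERT OVERWRITE`, and the staging table `lijie_table_new` and the sample rows are assumptions introduced for the demo. The table names `lijie_table` and `lijie_table_tmp` and the join itself follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Existing "Hive" table, still containing a row that was deleted upstream.
cur.execute("CREATE TABLE lijie_table (id INTEGER PRIMARY KEY, name TEXT, addr TEXT)")
cur.executemany(
    "INSERT INTO lijie_table VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "b", "y"), (3, "c", "z")],
)

# Primary-key table: the keys currently present in the source (id 2 was deleted).
cur.execute("CREATE TABLE lijie_table_tmp (id INTEGER PRIMARY KEY)")
cur.executemany("INSERT INTO lijie_table_tmp VALUES (?)", [(1,), (3,)])

# Emulate INSERT OVERWRITE: build the joined result in a staging table,
# then swap it in place of the original table.
cur.execute(
    """
    CREATE TABLE lijie_table_new AS
    SELECT a.id, a.name, a.addr
    FROM lijie_table a
    INNER JOIN lijie_table_tmp b ON a.id = b.id
    """
)
cur.execute("DROP TABLE lijie_table")
cur.execute("ALTER TABLE lijie_table_new RENAME TO lijie_table")
conn.commit()

remaining = cur.execute(
    "SELECT id, name, addr FROM lijie_table ORDER BY id"
).fetchall()
print(remaining)  # [(1, 'a', 'x'), (3, 'c', 'z')]
```

Note that the join can only remove rows. Inserts and updates still have to come from the regular incremental sync in step ①, and the primary-key table must be a complete, fresh snapshot, otherwise valid rows would be dropped as well.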
HbnKing changed the title from "How should incremental database synchronization actually be done?" to "How should database synchronization actually be done?" on Jan 11, 2019

HbnKing commented Jan 11, 2019

[image: 26911547195672_ pic]
