Posts in the 'Research/Robot Vision' category

Aliasing

Research/Robot Vision 2015. 6. 29. 10:35

Aliasing occurs when we scale an image down.

In the example picture, subsampling the image in a simple way (image decimation) produces a strange pattern (aliasing).

Why does this happen?

Recall the Nyquist theorem from signal processing.

If the intensity signal in our image has a period of 4 pixels, we need more than two samples per period, i.e. a sampling interval shorter than 2 pixels, to recover the same signal.

Otherwise, after sampling, it will be confused with a lower-frequency signal.
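To make this confusion concrete, here is a minimal NumPy sketch. The signal frequency (0.4 cycles per pixel) and the decimation factor (2) are illustrative assumptions, not values from the lecture, and `dominant_frequency` is a hypothetical helper.

```python
import numpy as np

n = np.arange(200)                      # pixel positions along one image row
signal = np.cos(2 * np.pi * 0.4 * n)    # intensity pattern with a period of 2.5 pixels

decimated = signal[::2]                 # keep every 2nd pixel, no prefiltering

def dominant_frequency(x):
    """Frequency (cycles per sample) of the strongest non-DC FFT bin."""
    spectrum = np.abs(np.fft.rfft(x))
    spectrum[0] = 0.0                   # ignore the mean value
    return np.argmax(spectrum) / len(x)

f_original = dominant_frequency(signal)      # cycles per original pixel
f_decimated = dominant_frequency(decimated)  # cycles per retained sample
```

After decimation the pattern shows up at 0.2 cycles per retained sample, which is 0.1 cycles per original pixel: a much lower frequency than the 0.4 cycles per pixel we started with.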

The way to eliminate this aliasing problem is to remove the high-frequency components from the image.

To do so, first blur the image with a Gaussian kernel and then subsample it.
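The blur-then-subsample recipe can be sketched with NumPy alone. The stripe pattern, `sigma=2.0`, and the helper names below are assumptions chosen for illustration; in practice a library routine would normally do the blurring.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian with half-width 3 * sigma."""
    h = int(np.ceil(3 * sigma))
    x = np.arange(-h, h + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur(image, sigma):
    """Separable Gaussian blur: filter every row, then every column."""
    k = gaussian_kernel_1d(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def decimate(image, step):
    """Naive subsampling: keep every step-th pixel in both directions."""
    return image[::step, ::step]

# Test image: vertical stripes with a period of 2.5 pixels (too fine to survive
# a 2x decimation without aliasing).
cols = np.arange(200)
image = np.tile(0.5 + 0.5 * np.cos(2 * np.pi * 0.4 * cols), (200, 1))

naive = decimate(image, 2)
prefiltered = decimate(blur(image, sigma=2.0), 2)

# Contrast of the remaining pattern, measured away from the image borders.
naive_contrast = naive[20:-20, 20:-20].std()
prefiltered_contrast = prefiltered[20:-20, 20:-20].std()
```

With naive decimation the stripes come back as a strong low-frequency pattern, while the prefiltered result is almost flat because the Gaussian removed the stripes before sampling.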

* Ref

[1] https://moocs.qut.edu.au/courses/791/learn

Morphology - Erode / Dilate image

Research/Robot Vision 2015. 5. 27. 10:11

For morphological erosion, the output pixel is set if all of the input pixels in the structuring element are set.

For morphological dilation, the output pixel is set if any of the input pixels in the structuring element are set.
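These two rules translate almost directly into code. The following is a minimal NumPy sketch for binary images, assuming an odd-sized structuring element and treating pixels outside the image as unset (both are my assumptions, not something the lecture specifies).

```python
import numpy as np

def _shifted_views(image, se):
    """Yield one shifted copy of the image per set pixel of the structuring element."""
    ph, pw = se.shape[0] // 2, se.shape[1] // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant", constant_values=False)
    for dy, dx in zip(*np.nonzero(se)):
        yield padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]

def erode(image, se):
    """Output pixel is set only if ALL input pixels under the element are set."""
    image, se = np.asarray(image, dtype=bool), np.asarray(se, dtype=bool)
    out = np.ones(image.shape, dtype=bool)
    for view in _shifted_views(image, se):
        out &= view
    return out

def dilate(image, se):
    """Output pixel is set if ANY input pixel under the element is set."""
    image, se = np.asarray(image, dtype=bool), np.asarray(se, dtype=bool)
    out = np.zeros(image.shape, dtype=bool)
    for view in _shifted_views(image, se):
        out |= view
    return out

# A 3x3 square of set pixels inside a 7x7 image, and a 3x3 structuring element.
image = np.zeros((7, 7), dtype=bool)
image[2:5, 2:5] = True
se = np.ones((3, 3), dtype=bool)

eroded = erode(image, se)    # only the centre pixel survives
dilated = dilate(image, se)  # the square grows to 5x5
```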

* Ref

[1] https://moocs.qut.edu.au/courses/791/learn

Correlation and Convolution

Research/Robot Vision 2015. 5. 22. 18:17

Convolution.pdf

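In fact, if you think about associativity, convolving with a Gaussian is the same as repeating a convolution with a smaller Gaussian several times. Because of associativity, the small kernels can instead be combined into one larger kernel and applied to the image in a single pass, which can be more efficient than filtering the image repeatedly!

(It feels a bit like adding one large number at once instead of computing 1+1+1+1...)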

https://www.youtube.com/watch?v=Ma0YONjMZLI

The key difference between correlation and convolution is that convolution is associative!!

(In general, people use convolution for image processing operations such as smoothing, and they use correlation to match a template to an image. Then, we don’t mind that correlation isn’t associative, because it doesn’t really make sense to combine two templates into one with correlation, whereas we might often want to combine two filters together for convolution. When discussing the Fourier series as a way of understanding filtering, we will use convolution, so that we can introduce the famous convolution theorem.)
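The associativity claim is easy to check numerically in 1D with NumPy's `np.convolve` and `np.correlate`. The three short arrays are arbitrary values picked for this sketch.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])   # stands in for the image
b = np.array([1.0, 0.0, -1.0])       # first filter
c = np.array([1.0, 2.0])             # second filter (deliberately asymmetric)

# Convolution: the grouping does not matter.
conv_left = np.convolve(np.convolve(a, b), c)
conv_right = np.convolve(a, np.convolve(b, c))

# Correlation: the grouping does matter.
corr_left = np.correlate(np.correlate(a, b, mode="full"), c, mode="full")
corr_right = np.correlate(a, np.correlate(b, c, mode="full"), mode="full")

convolution_is_associative = np.allclose(conv_left, conv_right)
correlation_is_associative = np.allclose(corr_left, corr_right)
```

This is why two convolution filters can be merged into one kernel before touching the image, while two correlation templates cannot be merged the same way.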

Correlation is used to measure image similarity, as shown below.

(If one image is the same image multiplied by a positive constant, the ZNCC value is 1, which tells us that they are the same image.)
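As a small illustration, ZNCC (zero-mean normalized cross-correlation) can be written in a few lines of NumPy. The function name `zncc` and the random test image are my own choices for this sketch.

```python
import numpy as np

def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equally sized images."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a0 = a - a.mean()                  # remove the mean (brightness offset)
    b0 = b - b.mean()
    return (a0 * b0).sum() / np.sqrt((a0**2).sum() * (b0**2).sum())

rng = np.random.default_rng(0)
img = rng.random((8, 8))

same_but_brighter = zncc(img, 3.0 * img + 10.0)  # scaled and offset copy
inverted = zncc(img, -img)                       # contrast-inverted copy
```

Because the mean is subtracted and the result is normalized, a scaled and offset copy scores 1, while a contrast-inverted copy scores -1.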

- Below: the FFT of an image

The "mathematical equations" are important, so don't skip them entirely. But the 2d FFT has an intuitive interpretation, too. For illustration, I've calculated the inverse FFT of a few sample images:

As you can see, only one pixel is set in the frequency domain. The result in the image domain (I've only displayed the real part) is a "rotated cosine pattern" (the imaginary part would be the corresponding sine).

If I set a different pixel in the frequency domain (at the left border):

I get a different 2d frequency pattern.

If I set more than one pixel in the frequency domain:

you get the sum of two cosines.

So like a 1d wave, that can be represented as a sum of sines and cosines, any 2d image can be represented (loosely speaking) as a sum of "rotated sines and cosines", as shown above.
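The experiment described in the quote can be reproduced with `np.fft.ifft2`. The image size `N` and the position `(u, v)` of the single frequency-domain pixel are arbitrary choices for this sketch.

```python
import numpy as np

N = 64
u, v = 3, 5                          # row and column of the one pixel we set

F = np.zeros((N, N), dtype=complex)  # frequency domain, all zero...
F[u, v] = 1.0                        # ...except a single pixel

pattern = np.fft.ifft2(F)            # back to the image domain

# The real part is a rotated cosine, the imaginary part the matching sine.
y, x = np.mgrid[0:N, 0:N]
phase = 2 * np.pi * (u * y + v * x) / N
expected_real = np.cos(phase) / N**2
expected_imag = np.sin(phase) / N**2
```

Moving the pixel to another `(u, v)` changes the orientation and spacing of the stripes, and setting several pixels gives the sum of the corresponding patterns, since the transform is linear.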

When we take the FFT of an image in OpenCV, we get a weird picture. What does this image denote?

It denotes the amplitudes and frequencies of the sines/cosines that, when added up, will give you the original image.

And what is its application?

There are really too many to name them all. Correlation and convolution can be calculated very efficiently using an FFT, but that's more of an optimization, you don't "look" at the FFT result for that. It's used for image compression, because the high frequency components are usually just noise.
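As a sketch of the "convolution via FFT" remark, the 1D case below multiplies the two spectra and transforms back. `fft_convolve` is a hypothetical helper name; the zero-padding to length `len(a) + len(b) - 1` is what makes the circular FFT product match ordinary linear convolution.

```python
import numpy as np

def fft_convolve(a, b):
    """Linear convolution of two real 1D signals via the convolution theorem."""
    n = len(a) + len(b) - 1              # length of the full linear convolution
    spectrum = np.fft.rfft(a, n) * np.fft.rfft(b, n)
    return np.fft.irfft(spectrum, n)

rng = np.random.default_rng(1)
signal = rng.random(50)
kernel = rng.random(7)

via_fft = fft_convolve(signal, kernel)
direct = np.convolve(signal, kernel)     # reference result
```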

* ref

[1] https://moocs.qut.edu.au/courses/791/learn

[2] https://www.youtube.com/watch?v=Ma0YONjMZLI

[3] http://dsp.stackexchange.com/questions/1637/what-does-frequency-domain-denote-in-case-of-images

[4] Correlation and Convolution, Class Notes for CMSC 426, Fall 2005 David Jacobs

(Important!)

Monadic operation, Diadic operation, Spatial operation

Research/Robot Vision 2015. 5. 22. 14:54

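Things to watch out for with diadic operations..

An interesting example is compositing two photos.

For each photo, create a logical image of 0s and 1s and treat the 1s as a mask...

Multiplying the mask with the original image extracts only the part of the image we need.

Then combine the masked images to get the composite~~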
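The masking idea can be sketched as follows. The tiny constant-valued "images" and the function name `composite` are placeholders for illustration only.

```python
import numpy as np

def composite(foreground, background, mask):
    """Take foreground pixels where mask is 1 and background pixels where it is 0."""
    mask = np.asarray(mask, dtype=bool)
    return foreground * mask + background * (~mask)

foreground = np.full((4, 4), 200.0)   # stand-in for the object photo
background = np.full((4, 4), 50.0)    # stand-in for the new scene

mask = np.zeros((4, 4), dtype=bool)   # logical image: 1 where the object is
mask[1:3, 1:3] = True

result = composite(foreground, background, mask)
```

The two masks are complementary, so every output pixel comes from exactly one of the two photos.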

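But what about the problem of the window falling off the edge of the image?

There is another problem. The window is rectangular, so does the result really have the meaning of an average?

(The pixel values should not all be added with equal weight, because their distances from the center differ!)

So, to get rid of the distance problem, we could try a circular window, but the pixels of a digital image cannot form a true circle...

So we can give each pixel a weight according to how much of it is covered!

In this way, the mask is multiplied by weights in the computation.. The window made up of these weights is called the kernel!!

However, the kernel values must not get too large, so we normalize the kernel (so that its weights sum to 1).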
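A minimal sketch of such a weighted-window operation is shown below. Replicating the border pixels with `np.pad(..., mode="edge")` is just one possible answer to the "window falls off the edge" question, and `apply_kernel` is a hypothetical helper.

```python
import numpy as np

def apply_kernel(image, kernel):
    """Weighted average of each pixel's neighbourhood with a normalized kernel."""
    kernel = np.asarray(kernel, dtype=float)
    kernel = kernel / kernel.sum()                     # weights sum to 1
    hy, hx = kernel.shape[0] // 2, kernel.shape[1] // 2
    # Assumption: handle the image border by repeating the edge pixels.
    padded = np.pad(np.asarray(image, dtype=float), ((hy, hy), (hx, hx)), mode="edge")
    out = np.zeros(np.shape(image), dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            window = padded[dy:dy + out.shape[0], dx:dx + out.shape[1]]
            out += kernel[dy, dx] * window
    return out

flat = np.full((5, 6), 7.0)
smoothed_flat = apply_kernel(flat, np.ones((3, 3)))    # stays 7 everywhere

impulse = np.zeros((5, 5))
impulse[2, 2] = 9.0
smoothed_impulse = apply_kernel(impulse, np.ones((3, 3)))  # spread over 3x3
```

Because the kernel is normalized, a flat image is unchanged and the total brightness of the impulse is preserved.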

The most commonly used kernel is the Gaussian kernel.

One of the most important things when using a Gaussian kernel is choosing an appropriate standard deviation!
(It controls the width of the kernel)

As a rule of thumb, the half-width h of the kernel (whose full width is 2h+1) is 3 * standard deviation!
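The rule of thumb above gives a direct recipe for building the kernel. This is a sketch with a hypothetical helper name, `gaussian_kernel`.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 2D Gaussian kernel of width 2h + 1 with h = 3 * sigma."""
    h = int(np.ceil(3 * sigma))                  # half-width from the rule of thumb
    y, x = np.mgrid[-h:h + 1, -h:h + 1]          # pixel offsets from the centre
    kernel = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()                 # normalize: weights sum to 1

kernel = gaussian_kernel(10)                     # sigma = 10 gives a 61 x 61 kernel
```

Unlike a uniform window, the weights here depend only on the distance from the center, so the kernel treats all directions equally.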

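In an actual test, with the parameters adjusted to give a similar amount of blurring, the comparison looks like this.

The upper picture is the result of applying a 31 * 31 window, and the lower one is the result of applying a Gaussian kernel with a standard deviation of 10.

In the upper picture, a ringing effect is visible because the window is not isotropic.
(Around the eyes, around the mouth, etc..)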

* ref

[1] https://moocs.qut.edu.au/courses/791/learn

What is gamma in a display device?

Research/Robot Vision 2015. 5. 21. 21:50

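(This refers to the devices used in old TVs..)

A display device takes an input voltage and produces light (luminance),

but the relationship between the input voltage and the luminance is not linear; it follows a power law.

The exponent (power) of that relationship is what we call gamma.

So if the image captured by a camera were converted to a voltage and sent to the TV as is, the displayed luminance would inevitably differ greatly from the real world.

The approach TV engineers took was to insert the inverse function.

As a result, the luminance of the display device became linearly related to that of the real world.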
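The idea can be sketched numerically. The value `GAMMA = 2.2` is an assumed, commonly quoted figure and does not come from this post; signals are normalized to the range 0 to 1.

```python
import numpy as np

GAMMA = 2.2  # assumption: a typical display gamma, not a value from the post

def display_response(voltage, gamma=GAMMA):
    """Light output of the display for a normalized input voltage."""
    return np.power(voltage, gamma)

def gamma_encode(luminance, gamma=GAMMA):
    """Inverse function applied before the signal reaches the display."""
    return np.power(luminance, 1.0 / gamma)

scene = np.linspace(0.0, 1.0, 5)                      # real-world luminance
uncorrected = display_response(scene)                 # sent to the TV as is
corrected = display_response(gamma_encode(scene))     # inverse function inserted
```

Without correction the mid-tones come out much darker than the scene; with the inverse function inserted first, the displayed values match the scene again.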

When we adjust the gamma of a display device, we are adjusting this exponent in order to change the luminance.

* Ref

[1] https://moocs.qut.edu.au/courses/791/learn


된사람 되기
