Using (x, y) or (y, x) as pixel access for a C++ image class? [on hold]

I've been wondering what the best (== least prone to user errors) approach would be for a pixel access operator for an image class in C++. operator() is overloaded to access pixels at a location x, y in the image - but the question is: should that access be (x, y) or (y, x)? In principle, I think we are more used to (x, y) for image pixel access, but in mathematics, matrix access is usually (row, col), and since for an image y is normally the row and x the column, that would result in (y, x).
Note this question is only about the API, the access - and not memory storage (row/col-major) or anything like that.
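
To make the API question concrete, here is a minimal sketch of what such a class might look like with the (x, y) convention. The class name Image, its members, and the 8-bit single-channel pixel type are assumptions for illustration only; the row-major storage is an internal detail that callers never see.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal image class; names and pixel type are illustrative
// assumptions, not taken from any particular library.
class Image {
public:
    Image(std::size_t width, std::size_t height)
        : width_(width), height_(height), data_(width * height, 0) {}

    // (x, y) convention: column first, then row.
    unsigned char& operator()(std::size_t x, std::size_t y) {
        assert(x < width_ && y < height_);
        return data_[y * width_ + x];
    }

    const unsigned char& operator()(std::size_t x, std::size_t y) const {
        assert(x < width_ && y < height_);
        return data_[y * width_ + x];
    }

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<unsigned char> data_;  // row-major here, hidden from callers
};
```

Switching to the (y, x) convention would only mean swapping the two parameters of operator(); the storage would not have to change.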

In Python, it's common to use NumPy or scipy.ndimage to represent images, which use (row, col) (equal to (y, x)).

So, (row, col) access is quite common, because images are 2D arrays or matrices (or 3D in the case of 3-channel images). In OpenCV and Python, cv::Mat as well as np.ndarray are used to represent both images and matrices, so they use (row, col) ((y, x)) access. The same goes for MATLAB. On the other hand, pure image libraries like Boost.GIL or Selene use (x, y), because it seems more intuitive for an image class.

This gets even more interesting if we are designing a C++ image class and then expose that class to Python with bindings like Boost.Python, where it is then represented as a NumPy array. One could end up in a situation where the access is (x, y) in C++ but (y, x) in Python. Disaster would probably ensue, even if this is properly documented.

Is (y, x) the clear winner here? What can be some arguments for either design decision?

If one goes with (y, x) - how about the order of width and height when constructing an image? For an image, I think we perceive it as the more natural convention to specify the width first, i.e. Image(width, height). For example, we would also specify an image as 640x480 - the width always goes first. But this again conflicts with the matrix convention, where the rows come first and we would say Image(rows, cols), which would be Image(height, width).
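
One possible way to sidestep the ordering question for the constructor, sketched here as an assumption rather than a recommendation from any library, is to wrap width and height in distinct types so the compiler rejects a bare pair of integers. The names Width, Height and SizedImage are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper types: each dimension carries its meaning in its type.
struct Width {
    explicit Width(std::size_t v) : value(v) {}
    std::size_t value;
};

struct Height {
    explicit Height(std::size_t v) : value(v) {}
    std::size_t value;
};

// Hypothetical image type that accepts the dimensions in either order.
class SizedImage {
public:
    SizedImage(Width w, Height h) : width_(w.value), height_(h.value) {}
    SizedImage(Height h, Width w) : SizedImage(w, h) {}

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

private:
    std::size_t width_;
    std::size_t height_;
};
```

With this sketch, SizedImage(Width{640}, Height{480}) and SizedImage(Height{480}, Width{640}) describe the same image, while SizedImage(640, 480) does not compile because the wrapper constructors are explicit.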

Many good questions generate some degree of opinion based on expert experience, but answers to this question will tend to be almost entirely based on opinions, rather than facts, references, or specific expertise. If this question can be reworded to fit the rules in the help center, please edit the question.

The choice is completely arbitrary, and depends on the target audience. As long as your API is self-consistent (i.e. always does it the same way), users of it can generally cope. The direction is also completely arbitrary - for example, whether y increases in an upward direction (which happens in some but not all scientific domains) or in a downward direction (which is how things get rendered on the screen, i.e. top left has the smallest values of x and y). The transformation between different representations is trivial, regardless.
– Peter
yesterday
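
As a small illustration of how simple that transformation can be, here is a sketch that converts a row index between a top-left origin (y grows downward) and a bottom-left origin (y grows upward). The function name flip_y is an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Convert a y index between a top-left-origin convention and a
// bottom-left-origin convention. Requires y < height.
// Applying the function twice returns the original index.
constexpr std::size_t flip_y(std::size_t y, std::size_t height) {
    return height - 1 - y;
}
```

The same function works in both directions, since the mapping is its own inverse.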

Flagged - Opinion based
– Richard Critten
yesterday

A lot of libraries equate matrices and images, but they are not the same thing. I think MATLAB started this trend, and all the others you mentioned followed suit. Before MATLAB, there never was a (y,x) indexing. Matrices are to be indexed as (i,j), and 1-based, with i the row number. Images are to be indexed (x,y), and 0-based. That is the Only Way (i.e. my opinion).
– Cris Luengo
yesterday

I would choose (row, col) if my library is about matrices (like MATLAB and OpenCV) and (col, row) if my library is about images. Remember that OpenCV provides both a (y,x) and a cv::Point(x,y) access to matrices!
– Micka
yesterday
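
A rough sketch of that "provide both" idea, under the assumption of a hypothetical Matrix2D class with int elements (this is not OpenCV's code), could look like this: one overload takes (row, col), the other takes a point holding (x, y).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type holding image-style coordinates.
struct Point {
    std::size_t x;
    std::size_t y;
};

// Hypothetical matrix type offering both access conventions.
class Matrix2D {
public:
    Matrix2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

    // Matrix-style access: row first, then column.
    int& at(std::size_t row, std::size_t col) {
        assert(row < rows_ && col < cols_);
        return data_[row * cols_ + col];
    }

    // Image-style access: the point's y is the row, its x is the column.
    int& at(Point p) { return at(p.y, p.x); }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;
};
```

Here m.at(2, 1) and m.at(Point{1, 2}) refer to the same element, and the argument type makes the intended convention visible at the call site.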

A similar question is big endian vs. little endian. Gulliver tried to resolve this with a Solomon-like judgement and ended up with nobody satisfied...
– Scheff
yesterday

2 Answers

This question really is a matter of preference. Usually images are (x,y) with (0,0) being in the upper left corner. Again though, I suppose anything can be used as long as it's relatively uniform throughout your library. It is all library dependent.

images "often" start in bottom-left (because of graph figure image behaviour), but that's not the memory order.
– Micka
yesterday

I've almost never seen one that starts in the bottom left API-wise, but again, it's a matter of choice.
– lordseanington
yesterday

3D rendering APIs and mathematical graph APIs might, because the mapping is more natural.
– Micka
yesterday

I've used textures in 3D graphics and I've seen where the coordinates of the texture mapping started at the top left, bottom left and the center of the image. These are the 3 most common that I have seen and in that order.
– Francis Cugler
39 secs ago

This is a good question, but it primarily will lead to many opinion based answers. I will explain why below. I'll vote to close as this is off topic.

The first thing to consider is perspective. Where on a given image or texture is its starting position, (0,0)? Is it the top left, the bottom left or, in some domains, the center of the image? This all has to do with how the image file itself stores the data that represents the image to be drawn to the screen.

One thing that may help you when working with images is to try not to think of their coordinates as (x,y). Many applications that work with graphics, specifically when rendering a texture and mapping it to some defined shape, use either (u,v) or (s,t) as their convention. Here is the most common form of (u,v) as texture or pixel coordinates, showing where the pixel indices originate and in which direction they increase within the texture or image:

[Image: Texture Coords]

Maybe this will help you decide which convention you want to use. As others have stated, it is important to stay with that convention once you start using it. One reason this matters: if you ever do 3D graphics in a 3D scene that has a local camera, the initial setup and construction of the camera object in relation to the world or scene depends on which convention you decide to use. There are 2 main conventions in a 3D scene: right-handed (RHS) and left-handed (LHS) coordinate systems. This is no different from choosing a convention for how an image is represented by an array. Even if the array has multiple dimensions, the data is still stored contiguously. When a 2D image is stored in a 1D flat array, you also have to worry about the stride, but that is beyond the scope of this discussion.
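
For readers who do want to see the stride idea, here is a minimal sketch of indexing into a flat buffer. It assumes interleaved channels and a stride measured in elements (not bytes); the function name pixel_offset and its parameters are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Offset of one channel of pixel (x, y) in a flat, row-by-row buffer.
// stride is the number of elements from the start of one row to the start
// of the next; it equals width * channels when rows are not padded and is
// larger when they are.
constexpr std::size_t pixel_offset(std::size_t x, std::size_t y,
                                   std::size_t channel, std::size_t channels,
                                   std::size_t stride) {
    return y * stride + x * channels + channel;
}
```

For example, with 3 channels and a stride of 1920 elements, the green channel (index 1) of the pixel at x = 2, y = 1 sits at offset 1920 + 6 + 1.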

“but it primarily will lead to many opinion based answers.” This means it is off topic for SO, and thus you should vote to close, not answer.
– Cris Luengo
yesterday

@CrisLuengo Yes, that may be so, but at the same time I chose to illustrate to the OP why it is opinion based with a physical representation and explained what measures to take.
– Francis Cugler
22 hours ago

Yes, and I agree with what you wrote. That is not the point. You have 3k+ rep on the site, and are expected to help moderate. The question is explicitly off topic, if you notice this, you are expected to vote to close, and to not answer. Otherwise there is no way of containing the topics of the site, and it will become a free-for-all. meta.stackoverflow.com/q/262573/7328782
– Cris Luengo
21 hours ago

@CrisLuengo I was going to vote to close, but by the time I received your feedback it was already closed. So I went ahead and edited my answer to state that I agree with voting to close. There are some questions that people are confused about, and I think they could use some clarification as to why the question is being voted closed, instead of it just being closed without any explanation. I believe this is one of them. Some questions are just too vague and don't need any explanation.
– Francis Cugler
2 mins ago
