Programming Languages Research Group: Git - firefly-linux-kernel-4.4.55.git/log

drm/i915: Fix startup failure in LRC mode after recent init changes

A previous commit introduced engine init changes:

commit 372ee59699d9 ("drm/i915: Only init engines once")

This broke execlists as intel_lr_context_render_state_init was trying to emit
commands to the RCS for the default context before the ring->init_hw was called.

Made a new gen8_init_rcs_context function and assign in to render ring
init_context. Moved call to intel_logical_ring_workarounds_emit into
gen8_init_rcs_context to maintain previous functionality.

Moved call to render_state_init from lr_context_deferred_create into
gen8_init_rcs_context, and modified deferred_create to call ring->init_context
for non-default contexts.

Modified i915_gem_context_enable to call ring->init_context for the default
context.

So init_context will now always be called when the hw is ready - in
i915_gem_context_enable for the default context and in lr_context_deferred_create
for other contexts.

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert pxvid to extvid lookup table to a function

The conversion table can be replaced with simple enough function.

   text    data     bss     dec     hex filename
839688   10987      24 850699   cfb0b drivers/gpu/drm/i915/i915.ko
839224   10987      24 850235   cf93b drivers/gpu/drm/i915/i915.ko

Result is 494 saved bytes (.05525%).

v2: - no run on sentences from subject (Chris, Jani)
    - be verbose about the savings (Chris, Daniel)

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Flatten engine init control flow

Now that sanity prevails and we have the clean split between software
init and starting the engines we can drop all the "have we allocate
this struct already?" nonsense.

Execlist code could benefit quite a bit more still, but that's for
another patch.

Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: Remove unnecessary goto in intel_primary_plane_disable()

The same logic can be implemented without it, and it even saves a line
of code.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Only init engines once

We can do this.

And now there's finally the clean split between software setup and
hardware setup I kinda wanted since multi-ring support was merged
aeons ago. It only took almost 5 years.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Move intel_init_pipe_control out of engine->init_hw

With this all the ->init_hw hooks really only set up hw state needed
to start the ring, all the software state setup and memory/buffer
allocations happen beforehand.

v2: We need to call intel_init_pipe_control after the ring init since
otherwise engine->dev is NULL and it falls over. Currently that's
now after the hw ring is enabled but a) we'll be fine as long as no
one submits a batch b) this will change soon.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: s/init()/init_hw()/ in intel_engine_cs

This is (mostly, some exceptions that need fixing) the hw setup
function which starts the ring. And not the function which allocates
all the resources.

Make this clear by giving it a better name.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Consolidate ring freespace calculations

There are numerous places in the code where the driver's idea of
how much space is left in a ring is updated using the driver's
latest notions of the positions of 'head' and 'tail' for the ring.
Among them are some that update one or both of these values before
(re)doing the calculation. In particular, there are four different
places in the code where 'last_retired_head' is copied to 'head'
and then set to -1; and two of these do not have a guard to check
that it has actually been updated since last time it was consumed,
leaving the possibility that the dummy -1 can be transferred from
'last_retired_head' to 'head', causing the space calculation to
produce 'impossible' results (previously seen on Android/VLV).

This code therefore consolidates all the calculation and updating of
these values, such that there is only one place where the ring space
is updated, and it ALWAYS uses (and consumes) 'last_retired_head' if
(and ONLY if) it has been updated since the last call.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make ring freespace calculation more robust

The used space in a ring is given by the cyclic distance from the
consumer (HEAD) to the producer (TAIL), i.e. ((tail-head) MOD size);
conversely, the available space in a ring is the cyclic distance
from the producer to the consumer, MINUS the amount reserved for a
"gap" that is supposed to guarantee that the producer never catches
up with or overruns the consumer. Note that some GEN h/w requires
that TAIL never approach to within one cacheline of HEAD, so the gap
is usually set to twice the cacheline size to ensure this.

While the existing code gives the correct answer for correct inputs,
if the producer HAS overrun into the reserved space, the result can
be a value larger than the maximum valid value (size-reserved). We
can improve this by reorganising the calculation, so that in the
event of overrun the result will be negative rather than over-large.

This means that the commonly-used test (available >= required)
will then reject further writes into the ring after an overrun,
giving some chance that we can recover from or at least diagnose
the original problem; whereas allowing more writes would likely both
confuse the h/w and destroy the evidence of what went wrong.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/dsi: add ports to intel_dsi to describe the ports being driven

Later on this can include multiple ports (e.g. (1 << PORT_A) | (1 <<
PORT_C)) to describe dual link DSI.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Gaurav K Singh <gaurav.k.singh@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/dsi: clean up MIPI DSI pipe vs. port usage

MIPI DSI works on ports A and C, which map to pipes A and B,
respectively. Things are going to get more complicated with the
introduction of dual link DSI support, so clean up the register defines
and code to match reality.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Gaurav K Singh <gaurav.k.singh@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Deal with video overlay on GPU reset

Clear the video overlay state on GPU reset. Any pending overlay request
in the ring has been nuked, and the display itself gets reset. So we
pretty much lose all state here. Adjust the software state to match so
that the next "putimage" will restore things to working order.

v2: Ass a locking check into intel_overlay_release_old_vid() (Daniel)

Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: s/0/NULL/ to appease sparse, reported by 0-day tester.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert 'trace_irq' to use requests rather than seqnos

Updated the trace_irq code to use requests instead of seqnos. This includes
reference counting the request object to ensure it sticks around when required.
Note that getting access to the reference counting functions means moving the
inline i915_trace_irq_get() function from intel_ringbuffer.h to i915_drv.h.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Resolve conflict due to shuffled merge order.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove redundant flip_work->flip_queued_ring

Similar to the patch from John which removed obj->ring.

Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: Remove the now redundant 'obj->ring'

The ring member of the object structure was always updated with the
last_read_seqno member. Thus with the conversion to last_read_req, obj->ring is
now a direct copy of obj->last_read_req->ring. This makes it somewhat redundant
and potentially misleading (especially as there was no comment to explain its
purpose).

This checkin removes the redundant field. Many uses were simply testing for
non-null to see if the object is active on the GPU. Some of these have been
converted to check 'obj->active' instead. Others (where the last_read_req is
about to be used anyway) have been changed to check obj->last_read_req. The rest
simply pull the ring out from the request structure and proceed as before.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert 'i915_seqno_passed' calls into 'i915_gem_request_completed'

Almost everywhere that caled i915_seqno_passed() was really asking 'has the
given seqno popped out of the hardware yet?'. Thus it had to query the current
hardware seqno and then do a signed delta comparison (which copes with wrapping
around zero but not with seqno values more than 2GB apart, although the latter
is unlikely!).

Now that the majority of seqno instances have been replaced with request
structures, it is possible to convert this test to be request based as well.
There is now a 'i915_gem_request_completed()' function which takes a request and
returns true or false as appropriate. Note that this currently just wraps up the
original _passed() test but a later patch in the series will reduce this to
simply returning a cached internal value, i.e.:
_completed(req) { return req->completed; }'

This checkin converts almost all _seqno_passed() calls. The only one left is in
the semaphore code which still requires seqnos not request structures.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Drop hunk touching the trace_irq code since I've dropped the
patch which converts that, and resolve resulting conflict.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Connect requests to rings at creation not submission

It makes a lot more sense (and makes future seqno -> request conversion patches
simpler) to fill in the 'ring' field of the request structure at the point of
creation rather than submission. Given that the request structure is assigned by
ring specific code and thus is locked to a ring from the start, there really is
no reason to defer this assignment.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert 'ring_idle()' to use requests not seqnos

More seqno value to request structure conversions. Note, this change temporarily
moves the 'get_seqno()' call inside ring_idle() but this will disappear again in
a later patch when i915_seqno_passed() itself is converted.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert trace functions from seqno to request

All the code above is now using requests not seqnos so it is possible to convert
the trace functions across. Note that rather than get into problematic reference
counting issues, the trace code only saves the seqno and ring values from the
request structure not the structure pointer itself.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert 'flip_queued_seqno' into 'flip_queued_request'

Converted the flip_queued_seqno value to be a request structure as part of the
on going seqno to request changes. This includes reference counting the request
being saved away to ensure it can not be retired and freed while the flip code
is still waiting on it.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Again get rid of the _irq request unref by simply moving that
into the unpin worker. Doesn't matter when we hang onto the request
for a bit longer, and in the unpin worker we already grab the
dev->struct_mutex anyway.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove obsolete seqno parameter from 'i915_add_request'

There is no longer any need to retrieve a seqno value from an i915_add_request()
call. The calling code already knows which request structure is being processed
(it can only be ring->OLR). And as the request itself is now used in preference
to the basic seqno value, the latter is now redundant in this situation.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert __wait_seqno() to __wait_request()

Now that all code above is using request structures instead of seqno values, it
is possible to convert __wait_seqno() itself. Internally, it is still calling
i915_seqno_passed(), this will be updated later in the series. This step is just
changing the parameter list and function name.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert mmio_flip::seqno to struct request

Converted the mmio_flip 'seqno' value to be a request structure as part of the
on going seqno to request changes. This includes reference counting the request
being saved away to ensure it can not be retired and freed while the flip code
is still waiting on it.

v2: Used the IRQ friendly request dereference call in the notify handler as that
code is called asynchronously without holding any useful mutex locks.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Drop the _irq variant and use the normal reques unref,
wrapped in dev->struct_mutex per the discussion on the m-l.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Check locking in i915_gem_request_unreference

With refcounting it looks like you can just drop that refcount, but
that's not really the case. So make sure no one forgets.

Motivated by the unlocked call in the mmio flip code.

Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: Convert i915_wait_seqno to i915_wait_request

Updated i915_wait_seqno() to take a request structure instead of a seqno value
and renamed it accordingly. Internally, it just pulls the seqno out of the
request and calls on to __wait_seqno() as before. However, all the code further
up the stack is now simplified as it can just pass the request object straight
through without having to peek inside.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Squash in hunk from an earlier patch which was rebased
wrongly.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert 'last_flip_req' to be a request not a seqno

Converted 'last_flip_req' to be an actual request rather than a seqno value as
part of the on going seqno to request changes. This includes reference counting
the request being saved away to ensure it can not be retired and freed while the
overlay code is still waiting on it.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Make 'i915_gem_check_olr' actually check by request not seqno

Updated the _check_olr() function to actually take a request object and compare
it to the OLR rather than extracting seqnos and comparing those.

Note that there is one use case where the request object being processed is no
longer available at that point in the call stack. Hence a temporary copy of the
original function is still present (but called _check_ols() instead). This will
be removed in a subsequent patch.

Also, downgraded a BUG_ON to a WARN_ON as apparently the former is frowned upon
for shipping code.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove 'outstanding_lazy_seqno'

The OLS value is now obsolete. Exactly the same value is guarateed to be always
available as PLR->seqno. Thus it is safe to remove the OLS completely. And also
to rename the PLR to OLR to keep the 'outstanding lazy ...' naming convention
valid.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Ensure requests stick around during waits

Added reference counting of the request structure around __wait_seqno() calls.
This is a precursor to updating the wait code itself to take the request rather
than a seqno. At that point, it would be a Bad Idea for a request object to be
retired and freed while the wait code is still using it.

v3:

Note that even though the mutex lock is held during a call to i915_wait_seqno(),
it is still necessary to explicitly bump the reference count. It appears that
the shrinker can asynchronously retire items even though the mutex is locked.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Remove wrongly squashed hunk which breaks the build.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Convert i915_gem_ring_throttle to use requests

Convert the throttle code to use the request structure rather than extracting a
ring/seqno pair from it and using those. This is in preparation for
__wait_seqno() becoming __wait_request().

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Replace last_[rwf]_seqno with last_[rwf]_req

The object structure contains the last read, write and fenced seqno values for
use in syncrhonisation operations. These have now been replaced with their
request structure counterparts.

Note that to ensure that objects do not end up with dangling pointers, the
assignments of last_*_req include reference count updates. Thus a request cannot
be freed if an object is still hanging on to it for any reason.

v2: Corrected 'last_rendering_' to 'last_read_' in a number of comments that did
not get updated when 'last_rendering_seqno' became 'last_read|write_seqno'
several millenia ago.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add helper functions to aid seqno -> request transition

Added helper functions for retrieving the ring and seqno entries from a request
structure. This allows the internal workings of the request structure to be
hidden from code that is using these. It also allows for useful
workarounds/debug code to be added as or when necessary.

Note that it is intended that the majority (if not all) uses of the seqno
accessor will disappear eventually as code is updated to use the request
structure itself rather than working with seqno values.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add reference count to request structure

The plan is to use request structures everywhere that seqno values were
previously used. This means saving pointers to structures in places that used to
be simple integers. In turn, that means that the target structure now needs much
more stringent lifetime tracking. That is, it must not be freed while some other
random object still holds a pointer to it.

To achieve this tracking, a reference count needs to be added. Whenever a
pointer to the structure is saved away, the count must be incremented and the
free must only occur when all references have been released.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Ensure OLS & PLR are always in sync

The aim is to replace seqno values with request structures. A step along the way
is to switch to using the PLR in preference to the OLS. That requires the PLR to
only be valid when and only when the OLS is also valid. I.e., the two must be
kept in lock step. Then, code which was using the OLS can be safely switched
over to using the PLR instead.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix short description of intel_display_power_is_enabled()

That's the version actually taking the dev_priv->power_domains lock.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Assert that we successfully downclock the GPU before suspend

Before suspending, we wait upon the outstanding GPU requests and flush
our pending idle handlers. This should downclock the GPU to its lowest
power state. Add a WARN to check that the delayed tasks were run and did
their job properly.

Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove user pinning code

Now unused.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: Don't read 'HEAD' MMIO register in LRC mode

The logical ring code was updating the software ring 'head' value
by reading the hardware 'HEAD' register. In LRC mode, this is not
valid as the hardware is not necessarily executing the same context
that is being processed by the software. Thus reading the h/w HEAD
could put an unrelated (undefined, effectively random) value into
the s/w 'head' -- A Bad Thing for the free space calculations.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Check for matching ringbuffer in logical_ring_wait_request()

The request queue is per-engine, and may therefore contain requests
from several different contexts/ringbuffers. In determining which
request to wait for, this function should only consider requests
from the ringbuffer that it's checking for space, and ignore any
that it finds that belong to other contexts.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Enable PSR for Baytrail and Braswell.

This patch is the last in series of VLV/CHV PSR,
that finally enable PSR by adding it to HAS_PSR
and calling the proper enable and disable
functions on the right places.

Although it is still disabled by default.

v2: Rebase over intel_psr and merge Durgadoss's fixes.
v3: Fix typo.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: VLV/CHV PSR debugfs.

Add debugfs support for Valleyview and Cherryview considering that
we have PSR per pipe and we don't have any kind of
performance counter as we have on other platforms that support PSR.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: VLV/CHV PSR Software timer mode

This patch introduces exit/activate functions for PSR
on VLV+. Since on VLV+ HW cannot track frame updates and force PSR
exit let's use fully SW tracking available.

v2: Rebase over intel_psr.c;
    Remove Single Frame update transitioning from state 3 to 5 directly;
    Fake a software invalidation for sprites and cursor so we don't miss
    any screen update;

v3: As pointed out by Durgadoss msecs_to_jiffies used on wait_for only uses int,
    so let's use 1 instead. Althought the 1/4 of this is needed for the
    transition let's use 1 for simplicity;
    Also fix comments as suggested by Durgadoss

Cc: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: PSR VLV/CHV: Introduce setup, enable and disable functions

The biggest difference from HSW/BDW PSR here is that VLV enable_source
function enables PSR but let it in Inactive state. So it might be called
on early stage along with setup and enable_sink ones.

v2: Rebase over intel_psr.c;
    Remove docs from static functions;
    Merge vlv_psr_active_on_pipe;
    Timeout for psr transition is 250us;
    Remove SRC_TRASMITTER_STATE;

v3: Rebase after is_psr_enabled function got removed;
    Get SRC_TRANSMITTER_STATE back to be on the safe side since
    default for panels is to require link training on exit when
    main link off;
    As pointed out by Durgadoss msecs_to_jiffies used on wait_for only uses int,
    so let's use 1 instead. Althought the 1/4 of this is needed for the
    transition let's use 1 for simplicity;

Cc: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Add PSR registers for PSR VLV/CHV.

Baytrail (Valleyview) and Braswell (Cherryview) uses a complete different
implementation of PSR that we currently have supported for
Haswell and Broadwell. So let's start by adding registers definitions.

I usually don't like commit that adds just registers without using,
but after I put all in one commit I realized that no one would want
to take the AR to review it so I decided to split in order to make
reviewer's life easier. Only last commit in this series will actually
enable the PSR on intel enable panel path.

But as it happens currently with HSW/BDW the plan is to let it
disabled by default (protected by kernel parameter)
while we are able to fully validate it.

v2: Remove a unused bit definition that isn't used on vlv and
reserved on chv as pointed out by Durgadoss.

Cc: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Remove intel_psr_is_enabled function.

This function was in use to check if PSR feature got enabled.
However on HSW and BDW we currently force psr exit by disabling
EDP_PSR_ENABLE bit at EDP_PSR_CTL(dev). So this function was actually
returning the active/inactive state that is different from the enable/disable
meaning and had the risk of false negative.

But anyway this check with DRRS was dangerous, since DRRS could try to get enabled
before PSR gets there. So let's just remove it for now.
A proper synchronization mechanism must be implemented later probably
using pipe config.

Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: remove PSR BDW single frame update.

Single frame update is a feature available on BDW for PSR that allows
Source to send Sink only one frame and get it updated. Usually useful
when page flipping. However with our frontbuffer tracking where we force
psr exit on flips we don't need this feature.

Also after it got added here many workaround was added to documentation
to mask some bits when using single frame update. So the safest thing
is to just stop using it.

v2: Rebase after removing skip aux one and fixing typo on commit message.

Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: PSR get full link off x standby from VBT

OEMs can specify if full_link might be always enabled, i.e. only_standby
over VBT.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: HSW/BDW PSR Set idle_frames = VBT + 1

Let's use VBT + 1 now we parse it.

v2: fix subject

v3: rebase over intel_psr and without counting on previous fix

Cc: Arthur Runyan <arthur.j.runyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Parse VBT PSR block.

PSR (aka SRD) block is defined at VBT and currently being used.
Mainly/At-least to configure the amount of idle_frames require to get
back to PSR Entry.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Implement the skl version of MMIO flips

Because the plane registers are different in Skylake we need to adapt
the MMIO code as well.

v2: Don't introduce yet another vfunc when the direction is do
consolidate the plane updates to use the same code path (Daniel)

v3:
  - Use enum pipe instead of int (Ville)
  - Also update PLANE_STRIDE when the tiling has changed (Ville)
  - Put intel_mark_page_flip_active() in the shared code (Damien)

v4:
  - Remove unused variable

v5:
  - Fix whitespace Vs tabs (Ville)

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Read out crtl1 for eDP/DPLL0

v2: Put the DPLL0 state readout in skylake_get_ddi_pll(), closer to
where the PLL assignement read out is done rather than the frequency
readout function. (Daniel)

v3: Remove stray new line (Damien)
Add Paulo's r-b tag for v1

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> (v1)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Reject modeset when the same digital port is used more than once

On pre-HSW we have two encoders per digital port: one HDMI, one DP.
However they are the same physical port in hardware and we can't enable
both at the same time. Reject the modeset if the user attempts this.

So far we've been saved by the fact that we never see both HDMI and DP
connectors as connected. But if the user decides to force a mode anyway,
all kinds of funny stuff might happen.

Unfortunately we don't seem to have any way to inform userspace that
such configurations are invalid except by returning an error from
setcrtc. possible_clones only covers real cloning situations, and
looking at the connector names doesn't work either since we don't
always register both connectors for the same port. I suppose the
only way to fix that would be to expose only a single encoder per
digital port like we do on HSW+ but that would be a fairly large
undertaking for little gain.

kms_setmode hits this since it forces modes on non-connected VGA and
HDMI connectors. Previosuly it just resulted in weirdness such as
failed link training. With this patch it will now get an error back
from the kernel and will die with an assert since it thinks that the
configuration should be fine.

v2: Deal with INTEL_OUTPUT_UNKNOWN (Paulo)

Cc: Paulo Zanoni <przanoni@gmail.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: mask RPS IRQs properly when disabling RPS

Atm, igt/gem_reset_stats can trigger the recently added WARN on
left-over PM_IIR bits in gen6_enable_rps_interrupts(). There are two
reasons for this:
1. we call intel_enable_gt_powersave() without a preceeding
intel_disable_gt_powersave()
2. gen6_disable_rps_interrupts() doesn't mask interrupts in PM_IMR

1. means RPS interrupts will remain enabled and can be serviced during
the HW initialization after a GPU reset. 2. means even if we called
gen6_disable_rps_interrupts() any new RPS interrupt during RPS
initialization would still propagate to PM_IIR too early (though
wouldn't be serviced).

This patch solves the 2. issue by also masking interrupts in PM_IMR, the
following patch fixes 1. getting rid of the WARN. This also makes
intel_enable_gt_powersave() and intel_disable_gt_powersave() more
symmetric.

Since gen6_disable_rps_interrupts() is called during driver loading with
i915 interrupts disabled add a new version of gen6_disable_pm_irq() that
doesn't WARN for this.

Also while at it, get the irq_lock around the whole PM_IMR/IER/IIR
programming sequence and make sure that any queued PM_IIR bit is also
cleared.

The WARN was caught by PRTS after I sent my previous RPS sanitizing
patchset and I could easily reproduce it on HSW. To actually fix it we
also need the next patch.

Reported-by: He, Shuang <shuang.he@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Tune down spurious CRC interrupt warning

We don't really synchronously turn them off from debugfs. We try to
avoid hitting them too badly by waiting one vblank, but apparently the
irq handler can still race through that gap.

Since this isn't really all that important for testcases, only for
debugging CRC issues let's tune it down to a debug message.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82602
Cc: Damien Lespiau <damien.lespiau@intel.com>
Acked-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: Fix context object leak for legacy contexts

Dynamic context pinning for LRCs introduced a leak in legacy mode.
Reinstate context unreference in i915_gem_free_request for legacy contexts.

Leak reported by i-g-t/drv_module_reload fixed by this patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86507
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: John Harrison<John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/skl: Update in Gen9 multi-engine forcewake range

Updates in forcewake range for Render/Media/Common
power wells for Gen9.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Zhe Wang <zhe1.wang@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/eDP: When enabling panel VDD cancel pending disable worker

Before testing if the panel VDD is enabled on eDP cancel any pending
disable worker. This makes sure the worker will be triggered with a
delay from the last time edp_panel_vdd_schedule_off() is called, not
the first time. This avoids unnecessary overhead.

https://bugs.freedesktop.org/show_bug.cgi?id=86201

v2: use cancel_delayed_work() instead of cancel_delayed_work_sync()
as the pps_mutexes will provide the required serialization with
edp_panel_vdd_work() while the sync variant may deadlock. Suggested
by Ville Syrjälä <ville.syrjala@linux.intel.com>.
Made commit message a bit clearer.

Signed-off-by: Egbert Eich <eich@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Handle runtime pm in the CRC setup code

The crc code doesn't handle anything really that could drop the
register state (by design so that we have less complexity). Which
means userspace may only start crc capture once the pipe is fully set
up.

With an i-g-t patch this will be the case, but there's still the
problem that this results in obscure unclaimed register write
failures. Which is a pain to debug.

So instead make sure we don't have the basic unclaimed register write
failure by grabbing runtime pm references. And reject completely
invalid requests with -EIO. This is still racy of course, but for a
test library we don't really care - if userspace shuts down the pipe
right afterwards the entire setup will be lost anyway.

v2: Put instead of get, spotted by Damien. Also explain the runtime pm
dance.

v3: There's really no need for rpm get/put since power_is_enabled only
checks software state (Damien).

References: https://bugs.freedesktop.org/show_bug.cgi?id=86092
Cc: Damien Lespiau <damien.lespiau@intel.com> (v2)
Tested-by: lu hua <huax.lu@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: Disable crtcs gracefully before GPU reset on gen3/4

The GPU reset also resets the display on gen3/4. The g33 docs say we
should disable all planes before flipping the reset switch. Just
disable all the crtcs instead. That seems a nicer thing to do anyway.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Grab modeset locks for GPU rest on pre-ctg

On gen4 and earlier the GPU reset also resets the display, so we should
protect against concurrent modeset operations. Grab all the modeset locks
around the entire GPU reset dance, remebering first ti dislogde any
pending page flip to make sure we don't deadlock. Any pageflip coming
in between these two steps should fail anyway due to reset_in_progress,
so this should be safe.

This fixes a lot of failed asserts in the modeset code when there's a
modeset racing with the reset. Naturally the asserts aren't happy when
the expected state has disappeared.

v2: Drop UMS checks, complete pending flips after the reset (Daniel)

Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Implement GPU reset for g33

g33 seems to sit somewhere between the 915/945/965 style and the
g4x style. The bits look like g4x, but we still need to do a full
reset including display.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Implement GPU reset for 915/945

915/945 have the same reset registers as 965, so share the code.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Restore the display config after a GPU reset on gen4

On pre-ctg GPU reset also resets the display hardware. Force a mode
restore after the GPU reset, and also re-init clock gating.

v2: Use intel_modeset_init_hw() instead of intel_init_clock_gating()
in case more relevant stuff gets added there at some point
Restore interrupts after the reset as well

Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix gen4 GPU reset

On pre-ctg the reset bit directly controls the reset signal. We must
assert it for >=20usec and then deassert it. Bit 1 is a RO status bit
which should also go down when the reset is no longer asserted.

Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Stop gathering error states for CS error interrupts

There's quite a few bug reports with error states where the error
reasons makes just about no sense at all. Like dying on tlbs for a
display plane that's not even there. Also users don't really report a
lot of bad side effects generally, just the error states.

Furthermore we don't even enable these interrupts any more on gen5+
(though the handling code is still there). So this mostly concerns old
platforms.

Given all that lets make our lives a bit easier and stop capturing
error states, in the hopes that we can just ignore them. In case
that's not true and the gpu indeed dies the hangcheck should
eventually kick in. And I've left some debug log in to make this case
noticeble. Referenced bug is just an example.

v2: Fix missing \n Jani spotted.

References: https://bugs.freedesktop.org/show_bug.cgi?id=82095
References: https://bugs.freedesktop.org/show_bug.cgi?id=85944
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Disallow pin ioctl completely for kms drivers

The problem here is that SNA pins batchbuffers to etch out a bit more
performance. Iirc it started out as a w/a for i830M (which we've
implemented in the kernel since a long time already). The problem is
that the pin ioctl wasn't added in

commit d23db88c3ab233daed18709e3a24d6c95344117f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri May 23 08:48:08 2014 +0200

    drm/i915: Prevent negative relocation deltas from wrapping

Fix this by simply disallowing pinning from userspace so that the
kernel is in full control of batch placement again. Especially since
distros are moving towards running X as non-root, so most users won't
even be able to see any benefits.

UMS support is dead now, but we need this minimal patch for
backporting. Follow-up patch will remove the pin ioctl code
completely.

Note to backporters: You must have both

commit b45305fce5bb1abec263fcff9d81ebecd6306ede
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Dec 17 16:21:27 2012 +0100

    drm/i915: Implement workaround for broken CS tlb on i830/845

which laned in 3.8 and

commit c4d69da167fa967749aeb70bc0e94a457e5d00c1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Sep 8 14:25:41 2014 +0100

    drm/i915: Evict CS TLBs between batches

which is also marked cc: stable. Otherwise this could introduce a
regression by disabling the userspace w/a without the kernel w/a being
fully functional on i830/45.

References: https://bugs.freedesktop.org/show_bug.cgi?id=76554#c116
Cc: stable@vger.kernel.org # requires c4d69da167fa967749a and v3.8
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

drm/i915: Only warn the first time we attempt to mmio whilst suspended

In all likelihood we will do a few hundred errnoneous register
operations if we do a single invalid register access whilst the device
is suspended. As each instance causes a WARN, this floods the system
logs and can make the system unresponsive.

The warning was first introduced in
commit b2ec142cb0101f298f8e091c7d75b1ec5b809b65
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Fri Feb 21 13:52:25 2014 -0300

drm/i915: call assert_device_not_suspended at gen6_force_wake_work

and despite the claims the WARN is still encountered in the wild today.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/chv: Enable AVI, SPD and HDMI infoframes for CHV.

CHV infoframes were not being enabled.

Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Don't clobber crtc->new_config when nothing changes

When doing a nop modeset we currently leave crtc->new_config point at
the already freed temporary pipe_config. That will anger the sanity
checks in intel_modeset_update_state() when the nop modeset gets
followed by a GPU reset on gen3/4 where the display block gets fully
reinitialized during the reset.

So leave crtc->new_config alone until we know a modeset is actually
required.

Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: rcar-du: Fix NULL encoder pointer dereference

The DRM connector's encoder pointer is managed internally by the DRM
core and set to NULL when the DRM connector is disconnected from the
CRTC it was attached to. This results in a NULL pointer dereference in
the HDMI connector functions when trying to call the associated slave
encoder's operations.

Fix this by retrieving the slave encoder pointer from the R-Car
connector structure instead of the DRM connector structure.

Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge tag 'drm-intel-next-2014-11-21-fixed' of git://anongit.freedesktop.org/drm-intel into drm-next

drm-intel-next-2014-11-21:
- infoframe tracking (for fastboot) from Jesse
- start of the dri1/ums support removal
- vlv forcewake timeout fixes (Imre)
- bunch of patches to polish the rps code (Imre) and improve it on bdw (Tom
  O'Rourke)
- on-demand pinning for execlist contexts
- vlv/chv backlight improvements (Ville)
- gen8+ render ctx w/a work from various people
- skl edp programming (Satheeshakrishna et al.)
- psr docbook (Rodrigo)
- piles of little fixes and improvements all over, as usual

* tag 'drm-intel-next-2014-11-21-fixed' of git://anongit.freedesktop.org/drm-intel: (117 commits)
  drm/i915: Don't pin LRC in GGTT when dumping in debugfs
  drm/i915: Update DRIVER_DATE to 20141121
  drm/i915/g4x: fix g4x infoframe readout
  drm/i915: Only call mod_timer() if not already pending
  drm/i915: Don't rely upon encoder->type for infoframe hw state readout
  drm/i915: remove the IRQs enabled WARN from intel_disable_gt_powersave
  drm/i915: Use ggtt error obj capture helper for gen8 semaphores
  drm/i915: vlv: increase timeout when setting idle GPU freq
  drm/i915: vlv: fix cdclk setting during modeset while suspended
  drm/i915: Dump hdmi pipe_config state
  drm/i915: Gen9 shadowed registers
  drm/i915/skl: Gen9 multi-engine forcewake
  drm/i915: Read power well status before other registers for drpc info
  drm/i915: Pin tiled objects for L-shaped configs
  drm/i915: Update ring freq for full gpu freq range
  drm/i915: change initial rps frequency for gen8
  drm/i915: Keep min freq above floor on HSW/BDW
  drm/i915: Use efficient frequency for HSW/BDW
  drm/i915: Can i915_gem_init_ioctl
  drm/i915: Sanitize ->lastclose
  ...

drm/i915: Don't pin LRC in GGTT when dumping in debugfs

LRC object does not need to be mapped into the GGTT when dumping. A side-effect
of this patch is that a compiler warning goes away (not checking return value
of i915_gem_obj_ggtt_pin).

v2: Broke out individual context dumping into a new function as the indentation
was getting a bit crazy. Added notification of contexts with no gem object for
debugging purposes. Removed unnecessary pin_pages and unpin_pages, replaced
with explicit get_pages for the context object as there may be no backing store
allocated at this time (Comment for get_pages says "Ensure that the associated
pages are gathered from the backing storage and pinned into our object").
Improved error checking - get_pages and get_page are checked for failure.

Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
[danvet: Align paramter continuation lines properly. Also add some
braces to the nested loops again for readability.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Merge branch 'linux-3.19' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-next

- Tegra K1 voltage support, and coherency improvements
- GM204 support (modesetting, still waiting on NVIDIA for signed fw to
proceed further), and a lot of bios/i2c/devinit adjustments needed to
support it
- GT21x memory reclocking work
- Various other bits and pieces, most of which are prep-work for a
couple of bigger projects I didn't get finished in time

* 'linux-3.19' of git://anongit.freedesktop.org/git/nouveau/linux-2.6: (73 commits)
  drm/nv50/kms: drop requirement that framebuffer bos be contig up-front
  drm/nv50/kms: directly use cursor image from userspace buffer
  drm/nouveau/kms: when pinning display-related buffers, force contig vram
  drm/nouveau: teach nouveau_bo_pin() how to force a contig vram allocation
  drm/nouveau/volt: add support for GK20A
  drm/nouveau/platform: add GPU speedo information to nouveau platform
  drm/nouveau/volt: allow non-bios voltage scaling
  drm/gf100-/gr: return non-fatal error code when fw not present
  drm/nouveau/devinit: bump priv ring timeouts before executing scripts
  drm/nouveau/bios: translate ramcfg strap through M0203
  drm/nouveau/fb: make use of M0203 routines for ram type determination
  drm/nouveau/bios: add parsing of BIT M(v2) +0x03 table
  drm/nouveau/core: allow vbios parsing without knowing chipset type
  drm/nouveau/lib: add null backend
  drm/nouveau/device: store revision
  drm/nouveau/core: add some forgotten subdevs to disable mask
  drm/gk20a/clk: fix max VCO value
  drm/nouveau: we need pin_refcnt for nouveau_bo_placement_set()
  drm/nv50-/kms: add some evo tracing ability for debugging
  drm/nv50/kms: use sclass() instead of trial-and-error
  ...

drm/nv50/kms: drop requirement that framebuffer bos be contig up-front

We'll move them at pin() time if necessary.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50/kms: directly use cursor image from userspace buffer

Preparation for transition to planes, which use framebuffers for the
cursor image. We've always done copies from the userspace buffer up
until now for legacy reasons, there's no good reason to do so on the
chipsets this code covers.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/kms: when pinning display-related buffers, force contig vram

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: teach nouveau_bo_pin() how to force a contig vram allocation

We have the ability to move buffers around in the kernel if necessary,
and should probably use it rather than failing if userspace passes us
a non-contig buffer for a plane.

The NOUVEAU_GEM_TILE_NONCONTIG flag from userspace will become a mere
initial placement hint once all the relevant paths have been updated.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/volt: add support for GK20A

The voltage value are calculated by the hardware characterized
result.

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/platform: add GPU speedo information to nouveau platform

For GK20A we need the GPU speedo value to calculate voltage levels.

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/volt: allow non-bios voltage scaling

Move the vbios parsing out of init() and call it conditionally if the
platform has a vbios. Non-vbios platforms can use the ctor() to init the
data structures.

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/gf100-/gr: return non-fatal error code when fw not present

This allows the module to load without acceleration.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/devinit: bump priv ring timeouts before executing scripts

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/bios: translate ramcfg strap through M0203

A machine has been spotted where the ramcfg strap is "8", and the ramcfg
xlat table goes 0-7,0-7, resulting in us selecting config 0 for memory
items.  On this particular system, config "8" is available and supposed
to be used.  It appears that starting from GT21x (where Mv2 appears),
we're supposed to use the value in this table instead.

One concern here is that not all the places we currently use ramcfg xlat
are supposed to be treated the same now.  The strap xlat table wasn't
removed from the vbios either, presumably for some kind of good reason.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/fb: make use of M0203 routines for ram type determination

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/bios: add parsing of BIT M(v2) +0x03 table

We only support one kind of matching here (ramcfg strap), but it appears
alternate methods are possible. I wrote a tool to scan our vbios repo
for other types, but did not see any used. Hopefully this means there
aren't any in the wild that will now break.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/core: allow vbios parsing without knowing chipset type

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/lib: add null backend

For the moment, just used to speed up vbios-only testing. Have some
ideas for extending in the future.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/device: store revision

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/core: add some forgotten subdevs to disable mask

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/gk20a/clk: fix max VCO value

For some reason max_vco was set to a lower value that it can support,
which prevented some clock states to be applied. Fix this by setting it
to the same value as downstream.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: we need pin_refcnt for nouveau_bo_placement_set()

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50-/kms: add some evo tracing ability for debugging

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50/kms: use sclass() instead of trial-and-error

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50/kms: remove a couple of cursor-related stub functions

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: fix pin refcnt leak in failure path

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: synchronize BOs when required

On architectures for which access to GPU memory is non-coherent,
caches need to be flushed and invalidated explicitly when BO control
changes between CPU and GPU.

This patch adds buffer synchronization functions which invokes the
correct API (PCI or DMA) to ensure synchronization is effective.

Based on the TTM DMA cache helper patches by Lucas Stach.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: allocate GPFIFOs and fences coherently

Specify TTM_PL_FLAG_UNCACHED when allocating GPFIFOs and fences to
allow them to be safely accessed by the kernel without being synced
on non-coherent architectures.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: implement explicitly coherent BOs

Allow nouveau_bo_new() to recognize the TTM_PL_FLAG_UNCACHED flag, which
means that we want the allocated BO to be perfectly coherent between the
CPU and GPU. This is useful on non-coherent architectures for which we
do not want to manually sync some rarely-accessed buffers: typically,
fences and pushbuffers.

A TTM BO allocated with the TTM_PL_FLAG_UNCACHED on a non-coherent
architecture will be populated using the DMA API, and accesses to it
performed using the coherent mapping performed by dma_alloc_coherent().

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: introduce nv_device_is_cpu_coherent()

Add a function allowing us to know whether a device is CPU-coherent,
i.e. accesses performed by the CPU on GPU-mapped buffers will
be immediately visible on the GPU side and vice-versa.

For now, a device is considered to be coherent if it uses the PCI bus on
a non-ARM architecture.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: warn when moving a pinned object

Pinned BOs are supposed to remain in their current location until
unpinned. Display a warning for the supposedly-erroneous case where we
are trying to move such objects.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>